
Comments

US Senate Set To Vote On Whether Climate Change Is a Hoax

rockmuelle Re:Yep it is a scam (666 comments)

For the sake of this discussion, mosquito-borne malaria is a warm-weather problem. Increased deaths from cold weather, which was the parent's straw man, occur when it's really cold (sub-freezing). Mosquitoes die when it freezes. Sure, they can be a problem even when it's cold, but not when it's deadly cold.

-Chris

about a week ago

What Will Google Glass 2.0 Need To Actually Succeed?

rockmuelle No Camera? (324 comments)

How about just removing the camera? That's the creepiest part of Google Glass.

I'm all for exploring the potential of having a display in my line of sight for getting information on demand or for AR applications. You don't need a camera for either of those. For AR, the GPS in the phone gives you position, accelerometers in the headset give you orientation, and a public database of roads and buildings gives the apps spatial awareness. If you want to be able to highlight people or cars, they could 'opt in' to a location sharing feature that publishes their coordinates.
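Here's a rough sketch of how the camera-free version could work, just to show it's plausible (Python; the landmark tuples and the 40-degree field of view are hypothetical, made up for illustration):

    # Sketch: camera-free AR using only GPS position, compass/accelerometer
    # heading, and a public database of landmark coordinates.
    import math

    def bearing_deg(lat1, lon1, lat2, lon2):
        # Initial great-circle bearing from the user to a landmark, in degrees.
        phi1, phi2 = math.radians(lat1), math.radians(lat2)
        dlon = math.radians(lon2 - lon1)
        y = math.sin(dlon) * math.cos(phi2)
        x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
        return math.degrees(math.atan2(y, x)) % 360

    def visible_landmarks(user_lat, user_lon, heading, landmarks, fov=40.0):
        # Keep landmarks whose bearing falls inside the headset's field of
        # view; delta is the horizontal offset to draw the label at.
        hits = []
        for name, lat, lon in landmarks:
            delta = (bearing_deg(user_lat, user_lon, lat, lon) - heading + 180) % 360 - 180
            if abs(delta) <= fov / 2:
                hits.append((name, delta))
        return hits

No camera input anywhere in that loop, which is the point.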

Battery life would probably be much better w/o the camera as well.

-Chris

about a week ago

US Senate Set To Vote On Whether Climate Change Is a Hoax

rockmuelle Re:Yep it is a scam (666 comments)

31,000 extra deaths due to cold weather and the flu in 2013:

http://www.dailymail.co.uk/new...

584,000 deaths due to malaria in the same year:

http://www.who.int/features/fa...

Malaria is transmitted by mosquitoes, which rely on warm weather to live. And that's just one warm-weather-related cause of death that will go up as the planet warms. Therefore, a warming planet will be a deadlier planet than a cooling one.

about a week ago

Justified: Visual Basic Over Python For an Intro To Programming

rockmuelle Re:Proprietary (647 comments)

Look, I'm a huge Python and open source advocate and use it for almost everything I do, but the "proprietary" argument doesn't hold any water. VB, and Microsoft's languages in general, have seen more long-term support than any open source language. They have consistently had a level of commitment to backwards compatibility and long-term support that no open source language implementation can match. Sure, with an open source language you can fix problems yourself*, but if there's good support from the vendor, as is the case with MS, you never need to.

You're going to need to give a much better reason than "proprietary" to discount the VB argument. There are lots of good ones, but this isn't one.

-Chris

*though I'd argue that there are only a few of us out there with the chops to actually do that

about a week ago

Ask Slashdot: Linux Database GUI Application Development?

rockmuelle HTML5 Client (264 comments)

Have you considered a Web client? HTML5 + JavaScript + [your favorite server language and ORM]* is a good development stack. It also has the benefit of zero-install for your clients.

We develop complex scientific software and made the decision to go HTML/JS for all our client code a few years ago and haven't regretted it. It takes a bit of time to learn the libraries, but there are some good, mature ones available to streamline development.

-Chris

*I've used Django and Tornado+SQLAlchemy extensively for this.
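For the curious, here's a bare-bones sketch of that stack using Tornado + SQLAlchemy serving JSON to an HTML5/JS client (the 'sample' table, SQLite file, and port are placeholders, swap in your own):

    # Minimal Tornado + SQLAlchemy endpoint; the browser just fetches JSON.
    import json
    import tornado.ioloop
    import tornado.web
    from sqlalchemy import create_engine, text

    engine = create_engine("sqlite:///samples.db")  # your real DB URL here

    class SampleHandler(tornado.web.RequestHandler):
        def get(self):
            # Pull rows from the database and hand them to the client as JSON.
            with engine.connect() as conn:
                rows = conn.execute(text("SELECT id, name FROM sample")).fetchall()
            self.set_header("Content-Type", "application/json")
            self.write(json.dumps([{"id": r[0], "name": r[1]} for r in rows]))

    if __name__ == "__main__":
        tornado.web.Application([(r"/api/samples", SampleHandler)]).listen(8888)
        tornado.ioloop.IOLoop.current().start()

The client side is then just a fetch/XHR call against /api/samples -- zero install, as advertised.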

about two weeks ago

Meet Flink, the Apache Software Foundation's Newest Top-Level Project

rockmuelle Re:Hadoop needs a fairly specialized problem (34 comments)

MPI is definitely for very specific problems and really isn't what I'd consider "conventional" cluster programming. Most people associate MPI with clusters and parallel computing, but if you look at what's actually running on most big clusters, it's almost always just batch jobs (or batch jobs implemented using MPI :) ).

Interestingly, all my examples were on genomics problems (processing SOLiD and Ion Torrent runs). We started going down the Hadoop path because we thought it'd be more accessible to the bioinformaticians. But, once we saw the performance differences (and, importantly, understood the source of them) we abandoned it pretty quickly for more appropriate hardware designs (fast disks, fat pipes, lots of RAM, and a few Linux tuning tricks -- swappiness=0 is your friend). Incidentally, GATK suffered from these same core performance problems. The original claim that the map-reduce framework would make GATK fast was never actually tested, just asserted in the paper. GATK's performance has always been orders of magnitude worse than the same algorithms implemented without map-reduce. But, it's from the Broad, so it must be perfect. ;)

I like Sector and Sphere. We also did a POC with them and they performed much better than the alternatives. Unfortunately, they also required very good programmers to use effectively.

Good stuff!

-Chris

about two weeks ago

Meet Flink, the Apache Software Foundation's Newest Top-Level Project

rockmuelle Re:Ok, I give up (34 comments)

More importantly, why did we need Hadoop when we already had [your_favorite_language] + [your_favorite_job_scheduler] + [your_favorite_parallel_file_system]?

Seriously, standard HPC batch processing methods are always faster and easier to develop for than latest_trendy_distributed_framework.

The challenges of data at scale* are almost all related to IO performance and the overhead of accessing individual records.

IO performance is solved by understanding your memory hierarchy and designing your hardware and tuning your file system around your common access patterns. A good multi-processor box with a fast hardware RAID, decent disks, and sufficient RAM will outperform a cheap cluster any day of the week and likely cost less (it's 2015, things have improved since the days of Beowulf). If you need to grow beyond that, a small cluster with InfiniBand (or 10 GigE) interconnects and Lustre (or GPFS if you have deep pockets) will scale to support a few petabytes of data at 3-4 GB/s throughput (yes, bytes, not bits). You'd be surprised what the right 4-node cluster can accomplish.

On the data access side, once the hardware is in place, record access times are improved by minimizing the abstraction penalty for accessing individual records. As an example, accessing a single record in Hadoop generates a call stack of over 20 methods from the framework alone. That's a constant multiplier of 20x on _every_ data access**. A simple Python/Perl/JS/Ruby script reading records from the disk has a much smaller call stack and no framework overhead. I've done experiments on many MapReduce "algorithms" and always find that removing the overhead of Hadoop (using the same hardware/file system) improves performance by 15-20x (yes, that's 'x', not '%'). Not surprisingly, the non-Hadoop code is also easier to understand and maintain.
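To make the comparison concrete, here's the shape of the no-framework version (Python sketch; the tab-delimited format with the key in the first column is just an example, any line-oriented record format works the same way):

    # Stream records straight off disk with a flat call stack: no framework,
    # no 20-deep method chain per record.
    import sys
    from collections import Counter

    counts = Counter()
    with open(sys.argv[1]) as f:   # sequential read -- the access pattern the
        for line in f:             # hardware and file system were tuned for
            key = line.split("\t", 1)[0]
            counts[key] += 1       # the entire "map" and "reduce" in two lines

    for key, n in counts.most_common(10):
        print(key, n)

Shard the input files across cores with your job scheduler and you've re-invented MapReduce, minus the overhead.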

tl;dr: Pick the right hardware and understand your data access patterns and you don't need complex frameworks.

Next week: why databases, when used correctly, are also much better solutions for big data than latest_trendy_framework. ;)

-Chris

*also: very few people really have data that's big enough to warrant a distributed solution, but let's pretend everyone's data is huge and requires a cluster.

** it also assumes the data was on the local disk and not delivered over the network, at which point, all performance bets are off.

about two weeks ago

Machine Learning Reveals Genetic Controls

rockmuelle Re:cis and mi regulation is not "bad" code (14 comments)

For small genomes, yes, but for large genomes, there is a lot of "unused" material.

Only about 6-10% of the human genome is transcribed into RNA, either the protein-coding kind or the non-coding types used in regulation. (Small genomes are almost always entirely coding and even include overlapping coding regions; large genomes are the ones that have "junk" DNA in them.)

Transcription is most closely related to a processor reading machine code and doing something with it. In a computer program, we know that we can safely remove dead code paths and the code will still function. This is not true for DNA. Remove a portion of someone's genome and they usually die.

It's much more likely that the "junk"/"noise" regions of the genome are structural and help the DNA fold so the chromosomes can specialize for different functions. DNA folds differently depending on the cell type in multicellular organisms. Because the nucleus of a cell is a fairly crowded place, the way the DNA folds determines which sites on it are even accessible for transcription. Muscle cells expose one set of gene coding regions, fat cells expose another.

Taken from this perspective, large genomes are more akin to an origami fortune teller than machine code. Depending on the series of folding/unfolding events, a specific fortune is revealed. The fortunes are encoded directly onto the paper, but the paper also forms the structure used to access the fortunes. Another actor reads the instructions and acts on them (a person in the origami case or polymerase for DNA).
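If you want the analogy in code, here's a toy model (everything here is made up for illustration; real chromatin accessibility is vastly more complicated):

    # Toy model: one genome, but each cell type "folds" it so that only
    # some regions are accessible for transcription.
    GENOME = "AAAGENE1TTTGENE2CCCGENE3GGG"  # hypothetical toy sequence

    FOLDS = {  # hypothetical (start, end) windows left exposed per cell type
        "muscle": [(3, 8), (19, 24)],   # exposes GENE1 and GENE3
        "fat":    [(11, 16)],           # exposes GENE2
    }

    def transcribable(cell_type):
        # The regions polymerase could actually reach in this cell type.
        return [GENOME[a:b] for a, b in FOLDS[cell_type]]

    print(transcribable("muscle"))  # ['GENE1', 'GENE3']
    print(transcribable("fat"))     # ['GENE2']

Same "paper", different folds, different fortunes revealed.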

about a month ago

Intel Processor Could Be In Next-Gen Google Glass

rockmuelle Re:Not in the hospital I work at. (73 comments)

I do similar systems for genomics. Despite all the hype around cloud services in our space, we're finding more interest in local copies of the standard databases with links out to the canonical sources as needed. The local copies keep hospital IT happy and ensure access if the network is wonky.

And, it turns out that most clinicians are comfortable sorting through database records on their own and don't like magic algorithms attempting to do it for them. Access to the basic data is what they want.

-Chris

about 2 months ago

Comet Probe Philae Unanchored But Stable — And Sending Back Images

rockmuelle Super Mario Galaxy! (132 comments)

Those pictures are amazing! I was immediately taken back to playing Super Mario Galaxy and imagined Mario running around the comet.

-Chris

about 2 months ago

Study: Body Weight Heavily Influenced By Heritable Gut Microbes

rockmuelle Re:Oh no (297 comments)

The first few weeks of any training program typically suck. That's where willpower (or encouragement if you're in a group) plays such an important role.

Once I'm past the initial hump, I always feel the "addictive" need to get more exercise and chase the high. In my specific case, the "high" comes after sustained exertion in the medium-to-high effort range. I rarely see it biking (I'm a bike commuter and never really push myself). But running, climbing, mountaineering, and snowboarding all bring it out. For running, on long runs at a moderate pace it kicks in around mile 5 or 6. For short, faster runs, it kicks in about 30 minutes after the run and lasts for a few hours. Other sports have similar patterns. In my experience, the feeling is most similar to hydrocodone (which, unfortunately, I also know about from running).

Wikipedia's description of the "runner's high" covers some of the suspected mechanisms for it.

-Chris

about 3 months ago

There's No Such Thing As a General-Purpose Processor

rockmuelle Programming complexity (181 comments)

A big reason we accept the trade offs of modern processors is that it's generally easy to program a broad range of applications for them.

In the mid-aughts (not very long ago, actually), there was a big push for heterogeneous multi-core processors and systems in the HPC space. Roadrunner at Los Alamos was a culmination of this effort (one of the first petascale systems). It was a mix of processor types, including IBM's Cell (itself a heterogeneous chip). Programming Roadrunner was a bitch. With different processor families, you had to decompose your algorithm to target the right processor for a given task. Then you had to worry about moving data efficiently between the different processors.

This type of development is fun as an intellectual exercise, but very difficult and time consuming in practice. It's also something compilers will never be good at, requiring experts in the architectures, domains, and applications to effectively use the system.

Another lesson from the period (and one that anyone who's done ASICs has known for years) is that general purpose hardware generally evolves fast enough to catch up with specialized hardware within a reasonable timeframe (usually 6-18 months; see D.E. Shaw's ASIC for protein folding as an example).

While custom processors are cool (I love hacking on them), they're rarely practical.

-Chris

about 3 months ago

Codecademy's ReSkillUSA: Gestation Period For New Developers Is 3 Months

rockmuelle One semester is a start (173 comments)

So, there is some truth to the 3 month number. I learned C from a minimal programming background in one semester as an undergrad, or about three months. Of course, 20 years later I'm still refining my skills. The rest of the CS degree gave me a much more solid foundation than I'd have had if I'd gone straight to work after learning C. Surprisingly, basic theory like complexity analysis comes in handy when building applications.

-Chris

about 3 months ago

Back To Faxes: Doctors Can't Exchange Digital Medical Records

rockmuelle Re:Bruce Perens (240 comments)

Open Standards and Protocols are what this space needs, along with regulations requiring vendors to allow interoperability for free or a nominal fee.

Open Source software, on the other hand, won't really solve any problems. Someone has to write the software and vet it. EHR software isn't an itch people typically want to scratch. Of course, an EHR platform could leverage Open Source software for development. A Web-based EHR could use an entire Open Source stack and even contribute libraries for protocol support.

Open Source is great for infrastructure components, not so great for user-facing applications. At some level in the stack, someone needs to do the UX work, testing, and validation to create an application people can actually use.

I would never advocate for a fully Open Source solution for EHRs or any other complex, user-facing software, but I would put incentives in place to leverage as much Open Source in the stack as possible. Plus, any company that does that right will have much cheaper dev costs and will be able to undercut the competition a bit (though for supported software, development is usually only 10-20% of the total cost, with support, marketing, sales, etc. taking up the bulk).

-Chris

about 4 months ago

Back To Faxes: Doctors Can't Exchange Digital Medical Records

rockmuelle Re:sounds like a job for (240 comments)

Um, Google tried the whole Google Health thing a few years back and gave up: http://en.wikipedia.org/wiki/G...

This is not an easy space to play in. Hospitals and doctors are slow to change. Once an investment has been made in a particular platform it's very difficult to replace it.

-Chris

about 4 months ago

Ask Slashdot: Any Place For Liberal Arts Degrees In Tech?

rockmuelle If High School is sufficient for CS, then why not? (392 comments)

The question is interesting in relation to the current bias against four year degrees for software developers in some circles. If, as Peter Thiel claims, you don't need a degree, then it shouldn't matter what your degree is if you get one. So, from that perspective, a tech degree or a liberal arts degree shouldn't make a difference. If a liberal arts degree makes for a more intellectually well rounded person, then it could be argued that that's the better degree for tech.

Of course, I don't buy Peter's argument at all. A good CS degree teaches foundational methods that can be applied throughout a career. Don't get me started on the number of times basic complexity theory or knowledge of the full memory hierarchy has helped improve performance of web pages. Most hobbyists don't have those skills and write them off as just academic oddities. A good CS degree also exposes you to a range of technologies and methods for developing software (no, CS is not just math, no more than physics is just theoretical physics). It gives you an environment where you can develop your skills and gain exposure to the breadth of topics in the field. It's a Good Thing(tm).
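A trivial illustration of the kind of win I mean (Python; the sizes are arbitrary, try it yourself):

    # Membership tests are O(n) each against a list, O(1) each against a set.
    # Basic complexity analysis, big real-world difference.
    import time

    ids = list(range(10000))
    lookups = list(range(0, 20000, 2))  # half hit, half miss

    t0 = time.perf_counter()
    hits = sum(1 for x in lookups if x in ids)       # list: O(n) per test
    t1 = time.perf_counter()
    id_set = set(ids)
    hits2 = sum(1 for x in lookups if x in id_set)   # set: O(1) per test
    t2 = time.perf_counter()

    print("list: %.3fs, set: %.5fs, same answer: %s"
          % (t1 - t0, t2 - t1, hits == hits2))

One line of theory, an orders-of-magnitude speedup. That's what the degree buys you.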

Should all programmers have CS degrees? Of course not, but those that do are always going to have an edge over most of the other ones (there are always exceptions - I know a few great developers without degrees).

-Chris

about 4 months ago

Oculus Rift CEO Says Classrooms of the Future Will Be In VR Goggles

rockmuelle Re:Some classes would be AWESOME! (182 comments)

VR simulations are only as good as our ability to model and simulate the things we're studying. Physics, maybe. Chemistry and Biology, no way. The latter two are messy and don't lend themselves to simulation except in a few very specific situations. If it's simply for information retrieval and watching videos, a book or screen is sufficient.

I've spent a lot of time with various 3D immersion technologies and scientific applications (old-school VR, CAVEs, polarized goggles, etc.) and the reality is that they don't add much. Don't get me wrong, they make GREAT demos. I love playing with the technology. But, spend any amount of time doing real work with them and their limitations quickly become apparent. It's not that the technology doesn't work, it's that most content doesn't really lend itself to the medium and, for content that does, getting the user experience right is a difficult and expensive task.

-Chris

about 4 months ago

3 Recent Flights Make Unscheduled Landings, After Disputes Over Knee Room

rockmuelle Re:Wait a minute, a few years ago I recall and AA (819 comments)

And that is how our current implementation of the free market actually works. No business action is made for the customer's benefit. It's always about making one more dollar off a captive customer base and pretending you're doing them a favor. America needs to return to stakeholder capitalism rather than the current shareholder model (yes, there actually are different models for market-based economies).

about 5 months ago

C++14 Is Set In Stone

rockmuelle Re:What about (193 comments)

Yes!!! I wish I had mod points. They basically had them ready to go for C++11 and then committee infighting killed them (Bjarne stubbornly backed the wrong horse - not that I have a strong opinion on this or anything ;) ).

Syntactic support for generic programming would be the single best addition to C++ to breathe new life into the language and get a whole generation of developers who've written it off interested in it. Generic programming is as paradigm shifting as OOP. It just kills me that it's so thoroughly obfuscated by template meta-programming in C++.

about 5 months ago

Submissions

Just what is 'Big Data'?

rockmuelle writes  |  more than 2 years ago

rockmuelle (575982) writes "I work in a 'Big Data' space (genome sequencing) and routinely operate on tera-scale data sets in a high-performance computing environment (high-memory (64-200GB) nodes, 10 GigE/IB networks, peta-scale high-performance storage systems). However, the more people I chat with professionally on the topic, the more I realize everyone has a different definition of what constitutes big data and what the best solutions for working with large data are. If you term yourself a 'big data' user, what do you consider 'big data'? Do you measure data in mega-, giga-, tera-, or petabytes? What is a typical data set you work with? What are the main algorithms you use for analysis? What turn-around times are typical for analyses? What infrastructure software do you use? What system architectures work best for your problem (and which have you tried that don't work well)?"

CorePy - Assembly Programming with Python

rockmuelle writes  |  more than 6 years ago

rockmuelle writes "We are pleased to announce the latest release of CorePy, now with full support for x86 processors (32 and 64-bit) and an Open Source license. CorePy is a Python package for developing assembly-level applications on x86, Cell BE and PowerPC processors. Its simple APIs enable the creation of complex, high-performance applications that take advantage of advanced processor features usually inaccessible from high-level scripting languages, including multiple cores and vector instruction sets (SSE, VMX, SPU). Based on an advanced run-time system, CorePy lets developers build and execute assembly-level programs interactively from the Python command prompt or embed them directly in Python applications. CorePy is available under a standard BSD license."

Journals

rockmuelle has no journal entries.
