Essentially, I need 1,000,000 different tables. They would be related in that if column 1 in the current table has 5 "1"s and 5 "0"s, each version of column 1 in every table would also have 5 "1"s and 5 "0"s but in a different order each time. The same would be true for all 10 columns.

I apologize if I am not explaining this well. My lack of coding lingo and my inability to communicate without hand gestures are doing me a disservice. If the relevance helps you understand, then:

Each column is a different cancerous tissue sample. The rows are different biomarkers. A "1" means that tumor has that biomarker; a "0" means it does not. We have several biomarkers that are in all 10 samples and we want to know if this is statistically significant. To do this we need to shuffle the 1s and 0s within each tumor's column and see how often, by chance alone, a row ends up with a 1 in every column (i.e. a row sum of 10).
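
The procedure described above can be sketched roughly as follows. This is only a minimal illustration in Python, not the code actually used in this thread; the function names and the choice to store the table as a list of columns are assumptions made for the example.

```python
import random


def count_full_rows(columns):
    """Count rows (biomarkers) that have a 1 in every column (sample)."""
    # zip(*columns) walks the table row by row.
    return sum(1 for row in zip(*columns) if all(row))


def shuffled_null_counts(columns, n_iter, seed=None):
    """Shuffle each column independently n_iter times.

    Every shuffled table keeps the same number of 1s and 0s per column.
    Returns the number of all-1 rows found in each shuffled table.
    """
    rng = random.Random(seed)
    # Work on copies so the caller's table is left untouched.
    cols = [list(c) for c in columns]
    counts = []
    for _ in range(n_iter):
        for c in cols:
            rng.shuffle(c)  # in-place shuffle of one column
        counts.append(count_full_rows(cols))
    return counts
```

With the observed table in `columns`, one would compare `count_full_rows(columns)` against the list returned by `shuffled_null_counts(columns, n_iter)`; the fraction of shuffled tables with at least as many all-1 rows as the observed table gives a rough empirical p-value.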

Re^3: Table shuffling challenge
by RichardK (Parson) on Aug 24, 2013 at 13:04 UTC

    Really, to me the fact that you're finding it difficult to explain your problem is a real red flag.

    Computers are as dumb as a very dumb thing, so if you can't explain your problem to other human beings how can you expect to successfully write a program? The program will only do exactly what you tell it to do, so you need to do all the critical thinking.

    Also, how will you test your program to understand if it's producing significant results or just random junk? Just because it doesn't crash doesn't mean that it's working properly.

    Being able to clearly explain your problem is hugely important, as demonstrated by the well-known debugging technique variously called CardboardProgrammer or Rubber Duck Debugging.

    Perhaps you should start by working with a very small dataset until you've got a good understanding of your problem space.

    You seem to be trying to derive some sort of probabilities, so perhaps there's a way to calculate what you need rather than trying this brute force approach. So maybe some time reading a statistics textbook would be time well spent?
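
    As one hedged illustration of the "calculate it" route: if each column is shuffled independently, a given row gets a 1 from a column with k ones out of n rows with probability k/n, so the chance that one row is all 1s is the product of those fractions, and the expected number of all-1 rows per table is n times that product. The sketch below is in Python with a hypothetical function name; it gives only the expected count, not the full distribution (rows are not independent of each other).

```python
from fractions import Fraction


def expected_full_rows(ones_per_column, n_rows):
    """Expected number of all-1 rows when each column is shuffled independently.

    ones_per_column: number of 1s in each column (hypothetical input format).
    n_rows: number of rows in the table.
    """
    p_row = Fraction(1)
    for k in ones_per_column:
        # Probability that this column contributes a 1 to a fixed row.
        p_row *= Fraction(k, n_rows)
    # Linearity of expectation: sum the per-row probability over all rows.
    return n_rows * p_row
```

    For the example given earlier in the thread (10 columns, each with 5 ones in 10 rows), `expected_full_rows([5] * 10, 10)` evaluates to 10 × (1/2)^10 = 5/512, i.e. roughly one all-1 row per hundred shuffled tables.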

      I used a much smaller dataset to create and troubleshoot the code I currently have. It does what I want it to do...just much more slowly than I would like. With a pen and paper I can very quickly and succinctly explain my problem, it's just difficult to explain without a visual aid. I do agree that I need to work on my communication skills concerning code; I have only taken one self-taught course and it is an area in which I need to improve.
Re^3: Table shuffling challenge
by poj (Abbot) on Aug 23, 2013 at 22:11 UTC
    In the quest for speed I've written this code in a way I wouldn't normally, but hopefully it reflects your requirement. 100 iterations take about a minute on my desktop, so the million would take about 170 hours! I'll work on speeding it up. poj