All,
One of the many tasks of the team I lead at work is making a technical recommendation for changes to the production environment. One of the biggest issues is with "scripts" that are to be used for data gathering running in cron jobs or sysadmin tools. The issue is that these are not part of the production application and haven't necessarily gone through the same rigor. In fact, they aren't even being written by developers in most cases.
To that end, we have decided to offer peer training with a focus on scripts (perl, shell, batch, etc) with the possibility of branching out into other areas later, such as improving SQL skills. This is completely voluntary, with the direction the classes take based on input from the participants. The first session was a huge success, but with such widely varying degrees of existing knowledge, we want to be sure to have something to offer everyone. A number of participants have said they don't want to slow down the class and are willing to pick up the basics on their own.
So we decided to offer "homework" assignments which are completely outside the class. These should be simple enough that the beginners can accomplish them but with enough "extra credit" that the intermediate folks can still find a challenge too. Here is our first example:
Scenario:
You are paying bills and staring at your phone bill trying to figure out why it is so high when the phone rings. It is a telemarketer telling you how much money you could save if you switch to their service. You kindly tell the person that you are not interested and, like every good telemarketer, they ask you what is the one number you call the most because they will let you call that number for free. You are not to be dissuaded so you just hang up the phone. As you walk back to your desk still staring at the phone bill you wonder what the answer to the telemarketer’s question really is.
Problem:
Parse a text representation of your phone bill and be able to answer the telemarketer's question. You can view your phone bill online but when you save it as HTML, you realize your perl skills aren’t quite good enough to parse it. It can be saved as a PDF but that isn’t any help either. You could print it, scan it using OCR and then convert it from a Word document to a text document but you are lazy and that seems like too much work. You decide the simplest solution is to open a text editor and copy/paste the data.
Input Description:
The text file is tab delimited and looks like:
Number Direction When Duration Cost
(123) 456-7890 Inbound Fri, 7/24/2009, 8:53AM 5 minutes $1.47
(456) 789-0123 Outbound Sun, 10/1/2009, 12:48PM 1 minute 34 seconds $0.12
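For anyone who wants a starting point, here is a minimal sketch of reading that format and answering the telemarketer's question. It assumes the five fields are separated by single tab characters in the order shown in the header, and that "the number you call the most" means Outbound calls only. The names parse_line and most_called are made up for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: five fields separated by single tabs, in header order.
sub parse_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;                    # strip the line ending
    my @fields = split /\t/, $line;
    return unless @fields == 5;              # skip rows with the wrong shape
    my %call;
    @call{qw(number direction when duration cost)} = @fields;
    # The header row fails this check, so it is skipped too
    return unless $call{direction} =~ /^(?:Inbound|Outbound)$/;
    return \%call;
}

# Returns (number, count) for the most frequently dialed number,
# or an empty list if no outbound calls were found.
sub most_called {
    my (@lines) = @_;
    my %count;
    for my $line (@lines) {
        my $call = parse_line($line) or next;
        next unless $call->{direction} eq 'Outbound';
        $count{ $call->{number} }++;
    }
    # Highest count first; ties are broken by number string
    my ($top) = sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;
    return defined $top ? ($top, $count{$top}) : ();
}

# The sample rows from above, with the tabs written out explicitly
my @bill = (
    "Number\tDirection\tWhen\tDuration\tCost\n",
    "(123) 456-7890\tInbound\tFri, 7/24/2009, 8:53AM\t5 minutes\t\$1.47\n",
    "(456) 789-0123\tOutbound\tSun, 10/1/2009, 12:48PM\t1 minute 34 seconds\t\$0.12\n",
);

my ($number, $calls) = most_called(@bill);
print "$number called $calls time(s)\n" if defined $number;
```

A real script would read the lines from a file or STDIN instead of a hard-coded list. Keeping the parsing in its own routine means the later questions can reuse it.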
Extra Credit:
After discovering the number called most frequently, you realize you have never called that number. You look closer and all the calls happen when you are at work. You begin to suspect that perhaps your significant other has something to hide. You also realize that the phone bill is still higher than you can explain. What other gremlins are hiding? You decide to data mine this phone bill dry. Here are some questions you might want to answer:
- Which numbers are being called while you are at work? (assume 8:00AM - 5:00PM M-F)
- Is anyone calling and hanging up?
  - Is there a way to determine it is just a prank caller?
  - Is it an automated telemarketer hanging up when the answering machine comes on?
  - Is it someone who only hangs up when you answer the phone but other times talks for longer?
- For any given number X, what is the rate to call that number?
  - Does time of day affect the cost?
  - Is the charge per minute or per second?
  - If not per second, how does the phone company round (any partial round up for instance)?
- Are you being overcharged for any calls?
  - Perhaps your plan charges a fixed rate and you discover a call that exceeds it (optional input)
  - Perhaps all calls fall into the same rate category but a few were charged more
- Which number is the most/least
  - Called most frequently (same as telemarketer's question)
  - Called with longest duration (individual and aggregate)
  - Costs the most (per minute as well as total overall cost)
  - Received calls with the most frequency
  - Received calls with the longest duration (individual call and aggregate total)
  - Same questions as above except s/most/least/g and s/longest/shortest/g
- What if you answered the questions above but wanted the top or bottom N calls instead of the extreme?
- How much time is spent on the phone (while you are home vs at work)?
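The first question in the list above comes down to classifying the When field. Here is one possible sketch, not the only way to do it. It assumes the field always looks like "Fri, 7/24/2009, 8:53AM", that the printed day abbreviation can be trusted rather than recomputed from the date, and that the work day runs from 8:00AM up to but not including 5:00PM. The name during_work_hours is made up for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %is_weekday = map { $_ => 1 } qw(Mon Tue Wed Thu Fri);

# Returns 1 if the call started during work hours, 0 if it did not,
# and undef if the When field does not match the assumed format.
sub during_work_hours {
    my ($when) = @_;
    my ($day, $hour, $min, $ampm) =
        $when =~ m{^(\w{3}),\s*\d{1,2}/\d{1,2}/\d{4},\s*(\d{1,2}):(\d{2})\s*([AP]M)$}i
        or return;
    return if $hour < 1 || $hour > 12 || $min > 59;

    # Convert the 12-hour clock to minutes since midnight:
    # 12:xxAM becomes hour 0, 12:xxPM stays hour 12
    $hour %= 12;
    $hour += 12 if uc($ampm) eq 'PM';
    my $minutes = $hour * 60 + $min;

    # Assumption: 8:00AM is inside the window, 5:00PM is outside it
    my $in_hours = $is_weekday{ ucfirst lc $day }
        && $minutes >= 8 * 60
        && $minutes < 17 * 60;
    return $in_hours ? 1 : 0;
}

for my $when ('Fri, 7/24/2009, 8:53AM', 'Sun, 10/1/2009, 12:48PM') {
    printf "%-25s %s\n", $when,
        during_work_hours($when) ? 'work hours' : 'outside work hours';
}
```

Because unparseable input returns undef rather than 0, a caller can tell "outside work hours" apart from "could not tell" by testing with defined.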
Those are not the only bumps along the road. You should also make this code as robust and flexible as possible. Have you considered:
- What international numbers will look like?
- That not all rows will have the data you expect (header/footer for instance)?
- That you may need to be able to specify command line parameters instead of re-coding every time you want to ask a different question?
- Whether your duration conversion routine handles all possible inputs?
- What would happen if a call lasted 0 seconds?
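To make the last two points concrete, here is a rough sketch of a duration conversion routine. It assumes durations are written as whole numbers followed by "hour", "minute" or "second" (singular or plural), as in the sample rows. The "hour" unit is a guess, since the sample only shows minutes and seconds. The name duration_to_seconds is made up for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %seconds_per = (hour => 3600, minute => 60, second => 1);

# Returns the duration in seconds, or undef if the text contains
# anything other than "<digits> <unit>" pairs.
sub duration_to_seconds {
    my ($text) = @_;
    return unless defined $text;
    my $total = 0;
    my $seen  = 0;
    my $rest  = $text;
    # Consume one "<digits> <unit>" pair from the front at a time
    while ($rest =~ s/^\s*(\d+)\s*(hour|minute|second)s?\b//i) {
        $total += $1 * $seconds_per{ lc $2 };
        $seen++;
    }
    # No pairs at all, or leftover text, means unexpected input
    return if !$seen || $rest =~ /\S/;
    return $total;
}

for my $text ('5 minutes', '1 minute 34 seconds', '0 seconds', 'five minutes') {
    my $secs = duration_to_seconds($text);
    printf "%-20s => %s\n", $text, defined $secs ? $secs : 'unparseable';
}
```

A 0 second call returns 0, which is false in Perl, so callers need to test the result with defined instead of plain truth. Otherwise a hang-up would be mistaken for a parse failure.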
We didn't spend a lot of time coming up with this problem but it seems to be a good fit for the beginner and the intermediate. We are hoping you might have some other good ideas?