Write a function that takes a data set as a parameter. Some of the records in the data set might have errors. The function returns a data set with no errors.
You've got a data set (a list of dictionaries). Some of the records in the data set might have errors. You want to remove the errors before analysis.
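To make that concrete, here's a sketch of what such a data set might look like. It assumes the data came from a CSV file read with csv.DictReader, which is one common way to end up with a list of dictionaries whose values are all strings. The goat names and scores are made up for illustration.

```python
import csv
import io

# Hypothetical CSV contents; the goat names and scores are made up.
raw_csv = """Goat,Before,After
Billy,45,62
,50,55
Nanny,abc,70
Gruff,30,140
"""

# DictReader yields one dictionary per row, with every value as a string.
raw_goat_scores = list(csv.DictReader(io.StringIO(raw_csv)))

for record in raw_goat_scores:
    print(record)
```

Only the first record is clean. The others have a missing name, a score that isn't a number, and a score that is out of range.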
Write a function that takes a data set (a list of dictionaries) as a parameter, and returns another data set with no errors.
Call it like this:
- cleaned_goat_scores = clean_goat_scores(raw_goat_scores)
Here's an example:
- def clean_goat_scores(raw_goat_scores):
-     # Create a new list for the clean records.
-     cleaned_goat_scores = []
-     # Loop over raw records.
-     for raw_record in raw_goat_scores:
-         # Is the record OK?
-         if is_record_ok(raw_record):
-             # Yes, make a new record with the right data types.
-             clean_record = {
-                 'Goat': raw_record['Goat'],
-                 'Before': float(raw_record['Before']),
-                 'After': float(raw_record['After'])
-             }
-             # Add the new record to the clean list.
-             cleaned_goat_scores.append(clean_record)
-     # Send the cleaned list back.
-     return cleaned_goat_scores
Line 5 loops over the records. Line 7 calls is_record_ok, a function that returns True if the data is OK, and False if it isn't.
Here's an example of is_record_ok:
- def is_record_ok(record):
-     # Check name.
-     goat_name = record['Goat']
-     if goat_name == '' or goat_name is None:
-         return False
-     # Check Before value.
-     before = record['Before']
-     if not is_score_ok(before):
-         return False
-     # Check After value.
-     after = record['After']
-     if not is_score_ok(after):
-         return False
-     return True
It goes through each field, returning False if there's a problem.
Lines 8 and 12 call is_score_ok, a function that tests a numeric field. Here's an example:
- def is_score_ok(score_in):
-     # Is it a number?
-     try:
-         score_number = float(score_in)
-     except Exception:
-         return False
-     # Check range. Written this way, NaN is rejected too.
-     if not (0 <= score_number <= 100):
-         return False
-     # All OK.
-     return True
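To see the three functions working together, here's a sketch that runs them on a small, made-up data set. The functions are lightly condensed versions of the ones above, and the sample records are hypothetical. They assume every value arrives as a string, as it would from a CSV file.

```python
def is_score_ok(score_in):
    # Is it a number?
    try:
        score_number = float(score_in)
    except Exception:
        return False
    # Chained comparison: True only for 0 to 100 (NaN fails it too).
    return 0 <= score_number <= 100


def is_record_ok(record):
    # The name must be present and non-empty.
    goat_name = record['Goat']
    if goat_name == '' or goat_name is None:
        return False
    # Both scores must pass the numeric check.
    return is_score_ok(record['Before']) and is_score_ok(record['After'])


def clean_goat_scores(raw_goat_scores):
    cleaned_goat_scores = []
    for raw_record in raw_goat_scores:
        if is_record_ok(raw_record):
            cleaned_goat_scores.append({
                'Goat': raw_record['Goat'],
                'Before': float(raw_record['Before']),
                'After': float(raw_record['After'])
            })
    return cleaned_goat_scores


# Made-up sample data: one good record and three with errors.
raw_goat_scores = [
    {'Goat': 'Billy', 'Before': '45', 'After': '62'},
    {'Goat': '', 'Before': '50', 'After': '55'},        # Missing name.
    {'Goat': 'Nanny', 'Before': 'abc', 'After': '70'},  # Not a number.
    {'Goat': 'Gruff', 'Before': '30', 'After': '140'},  # Out of range.
]

cleaned_goat_scores = clean_goat_scores(raw_goat_scores)
print(cleaned_goat_scores)
# Prints: [{'Goat': 'Billy', 'Before': 45.0, 'After': 62.0}]
```

Only the clean record survives, and its scores are now floats rather than strings. A record that lacks one of the three keys entirely would raise KeyError in this sketch. Looking fields up with record.get('Goat') is one way to treat that case as an error instead.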