Some IUCR codes and categories in the latest upload of the DB are out of sync with the crosswalk #14

Closed
bepetersn opened this issue Sep 2, 2014 · 1 comment

@bepetersn

While going very carefully over the data in our database, I found that there is some mysterious data whose origin I can't understand. Not too much, but a little. For example, searching by the charge description HARASSMENT BY TELEPHONE...

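from convictions_data.models import Conviction

# Print the statute and the stored IUCR code/category for each matching conviction
cs = Conviction.objects.filter(final_chrgdesc='HARASSMENT BY TELEPHONE')
for c in cs:
    print(c.case_number, c.final_statute, c.iucr_code, c.iucr_category)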

The output is:

2005CR2695101 720 135/1-(2)  
2007C22021901 720-135/1-1  
2007C66174901 720-135/1-1 3800 Interference With Public Officers  
2009CR0650001 720-135/1-1 3960 Intimidation  

Fine, right? Except that doing an equivalent search over the IUCR crosswalk yields totally different IUCR codes and categories.

from convictions_data.models import Conviction
from convictions_data.statute import get_iucr

cs = Conviction.objects.filter(final_chrgdesc='HARASSMENT BY TELEPHONE')
for c in cs:
    try:
        # Look the statute up in the crosswalk directly
        offenses = get_iucr(c.final_statute)
    except Exception:
        # Skip statutes that get_iucr() can't resolve
        continue
    for o in offenses:
        print(o.code, o.offense_category)

The output is:

2820 Disorderly Conduct  
2825 Disorderly Conduct  
2820 Disorderly Conduct  
2825 Disorderly Conduct  
2820 Disorderly Conduct  
2825 Disorderly Conduct  

This doesn't seem possible, at least to my current understanding of how we generated our IUCR categories and codes. My understanding is that the statute2iucr management command was originally run to generate codes and categories, which itself relies on just the same method I used, get_iucr(), and the crosswalk behind the scenes. However, as you can see, the crosswalk doesn't have these values.

The reason this is worth pointing out is that, at least by the crosswalk, it would seem we can map from the charge description HARASSMENT BY TELEPHONE to the IUCR category Disorderly Conduct reliably, which is part of the task of #6.
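To make "reliably" concrete, here is a minimal sketch of the check: gather every category the crosswalk returns for the statutes attached to one charge description and see whether exactly one category comes back. The `Offense` tuple, `TOY_CROSSWALK`, and `toy_lookup` are hypothetical stand-ins for `get_iucr()` and its return values, filled in from the output above rather than from the real crosswalk, and treating codes as strings is an assumption.

```python
from collections import namedtuple

# Hypothetical stand-in for the objects get_iucr() returns (assumption).
Offense = namedtuple("Offense", ["code", "offense_category"])


def categories_for_statutes(statutes, lookup):
    """Collect every IUCR category the lookup returns for the given statutes."""
    categories = set()
    for statute in statutes:
        try:
            offenses = lookup(statute)
        except Exception:
            # Statute couldn't be resolved; skip it, as in the loop above.
            continue
        categories.update(o.offense_category for o in offenses)
    return categories


# Toy crosswalk mirroring the output above; not the real crosswalk data.
TOY_CROSSWALK = {
    "720-135/1-1": [
        Offense("2820", "Disorderly Conduct"),
        Offense("2825", "Disorderly Conduct"),
    ],
}


def toy_lookup(statute):
    # Raises KeyError for statutes the toy crosswalk doesn't know.
    return TOY_CROSSWALK[statute]


statutes = ["720 135/1-(2)", "720-135/1-1", "720-135/1-1", "720-135/1-1"]
cats = categories_for_statutes(statutes, toy_lookup)
print(cats)  # {'Disorderly Conduct'}
```

If the resulting set has exactly one element, the charge description maps to a single category. With the real data, `lookup` would be `get_iucr` and `statutes` would come from the `final_statute` values of the filtered convictions.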

Perhaps the statute2iucr command simply needs to be run again. In the meantime, it's easy to work around this with the same double check I did above: given a statute, confirm that get_iucr() also returns an IUCR category for it.
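A rough sketch of that double check, for flagging records whose stored IUCR code disagrees with the crosswalk. `Record`, `Offense`, and `toy_lookup` are hypothetical stand-ins for the `Conviction` model and `get_iucr()`. The sample rows are copied from the output above, with blank fields represented as empty strings (an assumption about how they are stored), and the toy crosswalk is illustrative only.

```python
from collections import namedtuple

# Hypothetical stand-ins for the Conviction model and crosswalk entries (assumption).
Offense = namedtuple("Offense", ["code", "offense_category"])
Record = namedtuple(
    "Record", ["case_number", "final_statute", "iucr_code", "iucr_category"]
)


def find_out_of_sync(records, lookup):
    """Return (case_number, stored_code, crosswalk_codes) for each mismatch."""
    mismatches = []
    for r in records:
        try:
            offenses = lookup(r.final_statute)
        except Exception:
            # Nothing to compare against if the statute can't be resolved.
            continue
        crosswalk_codes = sorted({o.code for o in offenses})
        if r.iucr_code not in crosswalk_codes:
            mismatches.append((r.case_number, r.iucr_code, crosswalk_codes))
    return mismatches


# Toy crosswalk mirroring the output above; not the real crosswalk data.
TOY_CROSSWALK = {
    "720-135/1-1": [
        Offense("2820", "Disorderly Conduct"),
        Offense("2825", "Disorderly Conduct"),
    ],
}


def toy_lookup(statute):
    # Raises KeyError for statutes the toy crosswalk doesn't know.
    return TOY_CROSSWALK[statute]


records = [
    Record("2005CR2695101", "720 135/1-(2)", "", ""),
    Record("2007C22021901", "720-135/1-1", "", ""),
    Record("2007C66174901", "720-135/1-1", "3800", "Interference With Public Officers"),
    Record("2009CR0650001", "720-135/1-1", "3960", "Intimidation"),
]

mismatches = find_out_of_sync(records, toy_lookup)
for case_number, stored, expected in mismatches:
    print(case_number, repr(stored), expected)
```

Against the real database, `records` would be a `Conviction` queryset and `lookup` would be `get_iucr`. Records with a blank stored code are flagged too, since the crosswalk does have codes for their statute.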

ghing commented Sep 2, 2014

@bepetersn, I took a quick look at your example and I suspect that your intuition is right: these records were probably created with an earlier, less accurate version of our ILCS parsing code. Rerunning statute2iucr should fix this. We would also have to recreate the convictions from the disposition records, but that's no big deal. Thanks for looking into this.

@ghing ghing closed this as completed Sep 2, 2014