What should be done if, given an ILCS, there are multiple matching IUCR codes? #4
Comments
This is related to #3.
@bepetersn Good catch.
Is this precisely what happens? The In any case, the important observation is that we currently don't try to disambiguate between the multiple IUCR codes and just set it to an empty string. Here are a few thoughts off the top of my head:
#4 is part of an answer to this. The rest of it is that we ultimately don't care about getting IUCRs for every single statute, especially if it's not due to our incompetence, but because of the way statutes and IUCR codes get assigned. Between using charge descriptions and @ghing's work to roll up IUCR codes to our categories of interest, we will handle multiple IUCR codes for a statute.
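To illustrate the roll-up point, here is a rough sketch rather than the project's actual code. The `IUCR_TO_CATEGORY` mapping, the `category_for` function, and the codes themselves are all hypothetical placeholders. The idea is only that when every candidate IUCR code for a statute lands in the same category of interest, the ambiguity between codes stops mattering at the category level.

```python
# Hypothetical roll-up: IUCR code -> category of interest.
# Codes and categories are placeholders, not real crosswalk data.
IUCR_TO_CATEGORY = {"0001": "burglary", "0002": "burglary", "0003": "theft"}


def category_for(codes):
    """Return the category shared by all candidate codes, or None.

    If every candidate IUCR code rolls up to the same category, we can
    assign that category without choosing a single IUCR code.
    """
    categories = {IUCR_TO_CATEGORY.get(code) for code in codes}
    if len(categories) == 1:
        # May still be None if the only candidate code is unmapped.
        return categories.pop()
    # No candidates, or candidates that disagree on the category.
    return None
```

Under these assumptions, `category_for(["0001", "0002"])` resolves to a single category, while a statute whose candidate codes span two categories stays unresolved and would still need charge descriptions or some other signal.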
It seems as though our `iucr` package's functionality currently does nothing if, when trying to associate an IUCR code with an ILCS statute reference, it finds more than one code. More specifically, the `iucr` package raises an exception, which `statute.py` of this data project responds to by setting a disposition's `iucr_code` field to the empty string. We are currently losing about 30% of our IUCR data just to this, in absolute terms.

However, it's really a little bit worse than just 30%. Some statutes are affected disproportionately by this. I am planning on posting a JSON document with all of the statutes for which this happens, along with counts for each. Consider `720-5/19-1(a)`, though. Burglary. There are around 15000 dispositions for which there is no IUCR code because of this issue. This translates into about half as many convictions with no IUCR code.

Here are some of the other statutes disproportionately affected by this issue:
In my opinion, there isn't an obvious solution to this problem. The shape of the data varies among statutes, but typically there is at least SOME relationship between the multiple IUCR codes associated with a single statute. So from one perspective, it might not matter that much. The simplest thing I can think to do is to return the first IUCR code associated with a statute. It might be possible to make this slightly more dynamic in the cases where there might be value in doing so. For instance, choosing the most "severe" IUCR code.
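A minimal sketch of the two options above, purely for discussion. The `STATUTE_TO_IUCR` table, the `SEVERITY` ranks, and the `pick_iucr` function are hypothetical names, not the `iucr` package's actual API, and the codes are placeholders rather than real crosswalk entries.

```python
# Hypothetical crosswalk: statute reference -> candidate IUCR codes.
# The codes are placeholders, not real crosswalk data.
STATUTE_TO_IUCR = {
    "720-5/19-1(a)": ["0001", "0002", "0003"],
    "EXAMPLE-1": ["0004"],
}

# Hypothetical severity ranking: a higher number means more severe.
SEVERITY = {"0001": 1, "0002": 3, "0003": 2, "0004": 1}


def pick_iucr(statute, strategy="first"):
    """Return one IUCR code for a statute, or "" if none is known."""
    codes = STATUTE_TO_IUCR.get(statute, [])
    if not codes:
        # Same fallback that statute.py uses today: the empty string.
        return ""
    if strategy == "first":
        return codes[0]
    if strategy == "most_severe":
        # Codes without a rank sort below every ranked code.
        return max(codes, key=lambda code: SEVERITY.get(code, 0))
    raise ValueError(f"unknown strategy: {strategy!r}")
```

With this placeholder data, `pick_iucr("720-5/19-1(a)")` and `pick_iucr("720-5/19-1(a)", "most_severe")` return different codes, which is exactly the case where the choice of strategy would matter. Note that "first" depends on the order of codes in the crosswalk, so it is only meaningful if that order is stable.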
Thoughts, @ghing?