Merge pull request #49 from mitodl/variable_names_with_numbers

Introduces sampling for numbered variables
mitodl · Jul 1, 2018 · 26d88e0
2 parents ef4a1ef + 0b84ec7
Showing 10 changed files with 182 additions and 17 deletions.
diff --git a/README.md b/README.md
@@ -5,9 +5,9 @@
 
 A library of graders for edX Custom Response problems.
 
-Version 1.0.4 ([changelog](changelog.md))
+Version 1.1.0 ([changelog](changelog.md))
 
-Copyright 2017 Jolyon Bloomfield and Chris Chudzicki
+Copyright 2017-2018 Jolyon Bloomfield and Chris Chudzicki
 
 We thank the MIT Office of Digital Learning for their support.
 
@@ -17,6 +17,7 @@ We thank the MIT Office of Digital Learning for their support.
 - [Local Installation](#local-installation)
 - [Usage in edX](#usage-in-edx)
 - [Grader Documentation](#grader-documentation)
+- [FAQ](#FAQ)
 
 
 ## Demo Course
@@ -100,7 +101,7 @@ Here, the `correct_answer` entries are shown to students when they click "Show A
 [Extensive documentation](docs/README.md) has been compiled for the configuration of the different graders in the library.
 
 
-## FAQs
+## FAQ
 
 * After installing a virtual environment and doing `pip install`, `pytest` returns a number of errors for `no module named error`.
 

diff --git a/changelog.md b/changelog.md
@@ -1,9 +1,13 @@
 # Change Log
 
+## Version 1.1.0
+* Added numbered variables to FormulaGrader
+* Removed case-insensitive comparisons from FormulaGrader and IntegralGrader. *WARNING:* This breaks backwards compatibility, and is a departure from edX. However, we believe that students should know that `M` and `m` are different variables, and removing case-insensitive comparison fixes a number of ambiguous situations.
+
 ## Version 1.0.5
-* improved debugging information for FormulaGrader
+* Improved debugging information for FormulaGrader
 * FormulaGrader and IntegralGrader perform whitelist, blacklist, and forbidden_string checks after determining answer correctness. Incorrect answers using forbidden strings / functions are now marked incorrect, while correct answers using forbidden strings / functions raise errors.
-* minor improvements to existing unit tests
+* Minor improvements to existing unit tests
 
 ## Version 1.0.4
 * Authors can now specify a custom comparer function for FormulaGrader

diff --git a/course/course/course.xml b/course/course/course.xml
@@ -23,6 +23,7 @@
         <problem url_name="formula6"/>
         <problem url_name="formula7"/>
         <problem url_name="formula8"/>
+        <problem url_name="formula9"/>
       </vertical>
     </sequential>
     <sequential url_name="singlelistgrader" display_name="SingleListGrader">

diff --git a/course/problem/formula9.xml b/course/problem/formula9.xml
@@ -0,0 +1,21 @@
+<problem display_name="Numbered Variables" showanswer="always" weight="10" attempts="">
+
+  <p>If you have a system that contains a large or infinite number of numbered coefficients such as \(a_1\), \(a_2\), etc., it can be a pain to initialize all of these variables as independent variables that the grader should accept. Numbered variables allow you to specify that "a" is a numbered variable, and the system will then accept any entry of the form "a_{##}" where ## is an integer.</p>
+
+  <p>The answer to the problem below is "a_{0} + a_{1} + a_{-1}". Try including "a_{42}" in your expression -- the grader should be happy to parse your expression and grade you appropriately.</p>
+
+<script type="text/python" system_path="python_lib">
+from mitxgraders import FormulaGrader
+grader = FormulaGrader(
+    answers='a_{0} + a_{1} + a_{-1}',
+    numbered_vars=['a']
+)
+</script>
+
+<customresponse cfn="grader" inline="1">
+  <textline math="1" inline="1" correct_answer="a_{0} + a_{1} + a_{-1}"/>
+</customresponse>
+
+<a href="https://github.com/mitodl/mitx-grading-library/tree/master/course/problem/formula9.xml" target="_blank">View source</a>
+
+</problem>
diff --git a/docs/formula_grader.md b/docs/formula_grader.md
@@ -39,6 +39,25 @@ grader = FormulaGrader(
 The `sample_from` key must be a dictionary of 'variable_name': sampling_set pairs. You can specify a sampling set, a real interval, or a discrete set of values to sample from. The above example shows each of these in order.
 
 
+## Numbered Variables
+
+You can also specify special variables that are numbered. For example, if you specify that `a` is a numbered variable, students can include `a_{0}`, `a_{5}`, `a_{-2}`, etc., using any integer. All entries for a numbered variable will use the sampling set specified by the base name.
+
+````python
+grader = FormulaGrader(
+    answers='a_{0} + a_{1}*x + 1/2*a_{2}*x^2',
+    variables=['x'],
+    numbered_vars=['a'],
+    sample_from={
+        'x': [-5, 5],
+        'a': [-10, 10]
+    }
+)
+````
+
+If you have a variable name that would clash with a numbered variable (say, you defined `a_{0}` and also a numbered variable `a`), then the specific variable has precedence.
+
+
 ## Samples and Failable Evaluations
 
 To control the number of samples that are checked to ensure correctness, you can modify the `samples` key.
@@ -313,6 +332,7 @@ We have made a number of other improvements over the edX formula graders, includ
 * When students use an unknown variable, the resulting error message highlights that the unknown quantity was interpreted as a variable.
 * Similarly, when students use an unknown function, the resulting error message highlights that the unknown quantity was interpreted as a function. If a variable of that name exists, the error message suggests that a multiplication symbol was missing.
 * If an unexpected error occurs, students will see a generic "invalid input" message. To see exactly where things went wrong, set the `debug` flag to True, and a more technical message will usually be displayed.
+* Full sampling details are included when the `debug` flag is set to True.
 
 
 - [Home](README.md)
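
The matching rule described in the Numbered Variables documentation above (a base name followed by `_{integer}`) can be tried in isolation. The following is a minimal sketch that mirrors the `numbered_vars_regexp` helper added in this commit, using only the standard-library `re` module. The braces are escaped explicitly here, which is an editorial choice rather than the commit's exact pattern, and the sample strings are illustrative assumptions.

```python
import re


def numbered_vars_regexp(numbered_vars):
    """Build a regexp matching 'head_{integer}' for any of the given heads."""
    head_list = "|".join(map(re.escape, numbered_vars))
    pattern = (
        r"^((" + head_list + r")"  # capture the full string and the head
        r"_\{"                     # literal _{
        r"(?:-?[1-9]\d*|0)"        # an integer with no leading zeros
        r"\})$"                    # literal }, then end of string
    )
    return re.compile(pattern)


regexp = numbered_vars_regexp(["a", "Cat"])

# Illustrative candidates: the first four should match, the rest should not
candidates = ["a_{0}", "a_{42}", "a_{-2}", "Cat_{7}", "a", "a_{05}", "A_{0}", "a_{-0}"]
accepted = [s for s in candidates if regexp.match(s)]
print(accepted)  # ['a_{0}', 'a_{42}', 'a_{-2}', 'Cat_{7}']
```

Matching is case-sensitive, and leading zeros (`a_{05}`) and negative zero (`a_{-0}`) are rejected, so each integer has exactly one accepted spelling.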
diff --git a/mitxgraders/formulagrader.py b/mitxgraders/formulagrader.py
@@ -15,7 +15,7 @@
                                   construct_constants, construct_suffixes)
 from mitxgraders.baseclasses import ItemGrader, InvalidInput, ConfigError
 from mitxgraders.voluptuous import Schema, Required, Any, All, Extra, Invalid, Length
-from mitxgraders.helpers.calc import CalcError, evaluator
+from mitxgraders.helpers.calc import CalcError, evaluator, parsercache
 from mitxgraders.helpers.validatorfuncs import (Positive, NonNegative, is_callable,
                                                 PercentageString, is_callable_with_args)
 from mitxgraders.helpers.mathfunc import within_tolerance, DEFAULT_FUNCTIONS
@@ -101,7 +101,7 @@ def validate_only_permitted_functions_used(used_funcs, permitted_functions):
     InvalidInput: Invalid Input: function(s) 'h', 'Sin' not permitted in answer
     """
     used_not_permitted = [f for f in used_funcs if f not in permitted_functions]
-    if used_not_permitted != []:
+    if used_not_permitted:
         func_names = ", ".join(["'{f}'".format(f=f) for f in used_not_permitted])
         message = "Invalid Input: function(s) {} not permitted in answer".format(func_names)
         raise InvalidInput(message)
@@ -174,12 +174,52 @@ def validate_required_functions_used(used_funcs, required_funcs):
     InvalidInput: Invalid Input: Answer must contain the function f
     """
     for func in required_funcs:
-        used_funcs = [f for f in used_funcs]
         if func not in used_funcs:
             msg = "Invalid Input: Answer must contain the function {}"
             raise InvalidInput(msg.format(func))
     return True
 
+def numbered_vars_regexp(numbered_vars):
+    """
+    Creates a regexp to match numbered variables. Catches the full string and the head.
+
+    Arguments:
+        numbered_vars ([str]): a list of variable heads
+
+    Usage
+    =====
+
+    Matches numbered variables:
+    >>> regexp = numbered_vars_regexp(['b', 'c', 'Cat'])
+    >>> regexp.match('b_{12}').groups()
+    ('b_{12}', 'b')
+    >>> regexp.match('b_{-3}').groups()
+    ('b_{-3}', 'b')
+    >>> regexp.match('b_{0}').groups()
+    ('b_{0}', 'b')
+
+    Other variables match, too, in case-sensitive fashion:
+    >>> regexp.match('Cat_{17}').groups()
+    ('Cat_{17}', 'Cat')
+
+    Stuff that shouldn't match does not match:
+    >>> regexp.match('b') is None
+    True
+    >>> regexp.match('b_{05}') is None
+    True
+    >>> regexp.match('b_{-05}') is None
+    True
+    >>> regexp.match('B_{0}') is None
+    True
+    """
+    head_list = '|'.join(map(re.escape, numbered_vars))
+    regexp = (r"^((" + head_list + ")" # Start and match any head (capture full string, head)
+              r"_{" # match _{
+              r"(?:[-]?[1-9]\d*|0)" # match number pattern
+              r"})$") # match closing }, close group, and end of string
+    return re.compile(regexp)
+
+
 class FormulaGrader(ItemGrader):
     """
     Grades mathematical expressions, like edX FormulaResponse. Note that comparison will
@@ -229,6 +269,12 @@ class FormulaGrader(ItemGrader):
 
         variables ([str]): A list of variable names (default [])
 
+        numbered_vars ([str]): A list of numbered variable names, which can only occur
+            with a number attached to the end. For example, ['numvar'] will allow students
+            to write `numvar_{0}`, `numvar_{5}` or `numvar_{-2}`. Any integer will be
+            accepted. Use a sample_from entry for `numvar`. Note that a specifically-named
+            variable will take priority over a numbered variable. (default [])
+
         sample_from (dict): A dictionary of VariableSamplingSets for specific variables. By
             default, each variable samples from RealInterval([1, 5]) (default {}). Will
             also accept a list with two values [a, b] to sample from the real interval
@@ -271,6 +317,7 @@ def schema_config(self):
             Required('metric_suffixes', default=False): bool,
             Required('samples', default=5): Positive(int),
             Required('variables', default=[]): [str],
+            Required('numbered_vars', default=[]): [str],
             Required('sample_from', default={}): dict,
             Required('failable_evals', default=0): NonNegative(int)
         })
@@ -386,7 +433,7 @@ def __init__(self, config=None, **kwargs):
                 Any(VariableSamplingSet,
                     All(list, lambda pair: RealInterval(pair)),
                     lambda tup: DiscreteSet(tup))
-            for varname in self.config['variables']
+            for varname in (self.config['variables'] + self.config['numbered_vars'])
         })
         self.config['sample_from'] = schema_sample_from(self.config['sample_from'])
         # Note that voluptuous ensures that there are no orphaned entries in sample_from
@@ -412,11 +459,52 @@ def check_response(self, answer, student_input):
                 msg = "Invalid Input: Could not parse '{}' as a formula"
                 raise InvalidInput(msg.format(student_input))
 
+    def generate_variable_list(self, answer, student_input):
+        """
+        Generates the list of variables required to perform a comparison and the
+        corresponding sampling dictionary, taking into account any numbered variables.
+
+        Returns variable_list, sample_from_dict
+        """
+        # Pre-parse all expressions (these all get cached)
+        parsers = [
+            parsercache.get_parser(expr, self.suffixes)
+            for expr in answer['expect']['comparer_params']
+            ]
+        # If the student input is not empty, parse that too
+        if not (student_input is None or student_input.strip() == ""):
+            parsers.append(parsercache.get_parser(student_input, self.suffixes))
+        # Create a list of all variables used in the expressions
+        vars_used = set().union(*[parser.variables_used for parser in parsers])
+
+        # Initiate the variables list and sample_from dictionary
+        variable_list = self.config['variables'][:]
+        sample_from_dict = self.config['sample_from'].copy()
+
+        # Find all unassigned variables
+        bad_vars = set(var for var in vars_used if var not in variable_list)
+
+        # Check to see if any unassigned variables are numbered_vars
+        regexp = numbered_vars_regexp(self.config['numbered_vars'])
+        for var in bad_vars:
+            match = regexp.match(var)  # Returns None if no match
+            if match:
+                # This variable is a numbered_variable
+                # Go and add it to variable_list with the appropriate sampler
+                (full_string, head) = match.groups()
+                variable_list.append(full_string)
+                sample_from_dict[full_string] = sample_from_dict[head]
+
+        return variable_list, sample_from_dict
+
     def raw_check(self, answer, student_input):
         """Perform the numerical check of student_input vs answer"""
-        var_samples = gen_symbols_samples(self.config['variables'],
+        # Generate samples
+        variable_list, sample_from_dict = self.generate_variable_list(answer,
+                                                                      student_input)
+        var_samples = gen_symbols_samples(variable_list,
                                           self.config['samples'],
-                                          self.config['sample_from'])
+                                          sample_from_dict)
 
         func_samples = gen_symbols_samples(self.random_funcs.keys(),
                                            self.config['samples'],
@@ -521,6 +609,8 @@ class NumericalGrader(FormulaGrader):
 
         variables ([str]): Will always be an empty list
 
+        numbered_vars ([str]): Will always be an empty list
+
         sample_from (dict): Will always be an empty dictionary
 
         failable_evals (int): Will always be 0
@@ -537,6 +627,7 @@ def schema_config(self):
             Required('tolerance', default='5%'): Any(PercentageString, NonNegative(Number)),
             Required('samples', default=1): 1,
             Required('variables', default=[]): [],
+            Required('numbered_vars', default=[]): [],
             Required('sample_from', default={}): {},
             Required('failable_evals', default=0): 0
         })
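
The core of `generate_variable_list` can be illustrated without the grader or parser classes. The sketch below is a simplified stand-in, not the library's code: `resolve_variables` is a hypothetical name, the set of used variables is passed in directly instead of being read from cached parsers, and sampling sets are plain lists. It shows how explicitly named variables keep their own sampler while unassigned `head_{n}` names inherit the sampler of their head.

```python
import re


def resolve_variables(vars_used, variables, numbered_vars, sample_from):
    """Return (variable_list, sample_from_dict) extended with numbered variables."""
    heads = "|".join(map(re.escape, numbered_vars))
    regexp = re.compile(r"^((" + heads + r")_\{(?:-?[1-9]\d*|0)\})$")

    variable_list = list(variables)      # copy, so the caller's config is not mutated
    sample_from_dict = dict(sample_from)

    # Only variables not already declared are candidates for numbered matching.
    # Sorting is used here purely to make the sketch's output deterministic.
    for var in sorted(set(vars_used) - set(variable_list)):
        match = regexp.match(var)
        if match:
            full_string, head = match.groups()
            variable_list.append(full_string)
            sample_from_dict[full_string] = sample_from_dict[head]
    return variable_list, sample_from_dict


variables, samplers = resolve_variables(
    vars_used={"a_{0}", "a_{1}", "a_{-1}", "b"},
    variables=["a_{0}"],
    numbered_vars=["a"],
    sample_from={"a_{0}": [-5, 5], "a": [-10, 10]},
)
print(variables)           # ['a_{0}', 'a_{-1}', 'a_{1}']
print(samplers["a_{0}"])   # [-5, 5]
print(samplers["a_{1}"])   # [-10, 10]
```

Because `a_{0}` is declared explicitly, it is never tested against the regexp and keeps its own interval, which is the precedence rule stated in the docs. The unknown name `b` is left out of the variable list, so in the real grader it would be reported later as an undefined variable.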
diff --git a/mitxgraders/helpers/calc.py b/mitxgraders/helpers/calc.py
@@ -240,7 +240,7 @@ def get_parser(self, formula, suffixes):
 
         # Strip out any whitespace, so that two otherwise-equivalent formulas are treated
         # the same
-        stripformula = "".join([char for char in formula if char != " "])
+        stripformula = formula.replace(" ", "")
 
         # Construct the key
         suffixstr = ""
@@ -444,15 +444,15 @@ def parse_algebra(self):
         #   subscripts (optional):
         #       any combination of alphanumeric and underscores
         #   lower_indices (optional):
-        #       Of form "_{<alaphnumeric>}"
+        #       Of form "_{(-)<alphanumeric>}"
         #   upper_indices (optional):
-        #       Of form "^{<alaphnumeric>}"
+        #       Of form "^{(-)<alphanumeric>}"
         #   tail:
         #       any number of primes
         front = Word(alphas, alphanums)
         subscripts = Word(alphanums + '_') + ~FollowedBy('{')
-        lower_indices = Literal("_{") + Word(alphanums) + Literal("}")
-        upper_indices = Literal("^{") + Word(alphanums) + Literal("}")
+        lower_indices = Literal("_{") + Optional("-") + Word(alphanums) + Literal("}")
+        upper_indices = Literal("^{") + Optional("-") + Word(alphanums) + Literal("}")
         tail = ZeroOrMore("'")
         inner_varname = Combine(
             front +

diff --git a/mitxgraders/version.py b/mitxgraders/version.py
@@ -3,4 +3,4 @@
 Version number
 """
 
-__version__ = "1.0.5"
+__version__ = "1.1"
diff --git a/python_lib.zip b/python_lib.zip
diff --git a/tests/test_formulagrader.py b/tests/test_formulagrader.py
@@ -21,6 +21,7 @@
 )
 from mitxgraders.voluptuous import Error, MultipleInvalid
 from mitxgraders.sampling import set_seed
+from mitxgraders.helpers.calc import UndefinedVariable
 from mitxgraders.version import __version__ as VERSION
 from mitxgraders.helpers.calc import UndefinedFunction
 
@@ -648,6 +649,17 @@ def test_docs():
     )
     assert grader(None, '1+x^2+y+z/2')['ok']
 
+    grader = FormulaGrader(
+        answers='a_{0} + a_{1}*x + 1/2*a_{2}*x^2',
+        variables=['x'],
+        numbered_vars=['a'],
+        sample_from={
+            'x': [-5, 5],
+            'a': [-10, 10]
+        }
+    )
+    assert grader(None, 'a_{0} + a_{1}*x + 1/2*a_{2}*x^2')['ok']
+
     grader = FormulaGrader(
         answers='1+x^2',
         variables=['x'],
@@ -803,3 +815,18 @@ def test_errors():
     )
     with raises(CalcError, match="Division by zero occurred. Check your input's denominators."):
         grader(None, '1/0')
+
+def test_numbered_vars():
+    """Test that numbered variables work"""
+    grader = FormulaGrader(
+        answers='a_{0}+a_{1}+a_{-1}',
+        variables=['a_{0}'],
+        numbered_vars=['a'],
+        sample_from={
+            'a_{0}': [-5, 5],
+            'a': [-10, 10]
+        }
+    )
+    assert grader(None, 'a_{0}+a_{1}+a_{-1}')['ok']
+    with raises(UndefinedVariable, match="a not permitted in answer as a variable"):
+        grader(None, 'a')