CraigOgden
Community Member

Regular Expressions.


I am trying to write a regular expression that matches factors of a polynomial. For example:

(x-5)(x+2)(x-1)

Of course these three factors can be in any order but cannot have duplicates.

This is what I tried:

^\(x-5\)|\(x\+2\)|\(x-1\)$

But that doesn't allow for any order. I also need this to work with even more factors.

Any ideas?

1 Solution
James
Community Champion

 @CraigOgden ,

I'll answer your immediate question toward the end, but I want to preface it with a warning about the futility of doing this.

My advice has long been to resist the temptation to automatically grade mathematical content with Canvas and look for alternative methods of asking the question or alternative places to ask it.

No one wants to hear that advice, but asking the computer to automatically grade algebra is like asking it to grade a paper written in a foreign language. Canvas doesn't have the skills to do that. A student might type "(x-5)" or "(x - 5)". They might include a * for a multiplication operator or not. They might use ^0.5 or ^(1/2) for a square root, but they're just as likely to use sqrt() or ^1/2. They could use x*x, xx, x^2, or x², where the ² is the Unicode character for squared.

It's easy for math teachers to look at something and tell if the student is doing it right. It's not so easy for a computer to assess the mathematical correctness. Yet math teachers insist on wanting a generalized program like Canvas to assess mathematical correctness.

There is no way to cover all of the possibilities when it is anything more than a simple case. You need a specialized editor that lets people enter mathematical content, validates it, and knows how to operate on it. Canvas doesn't have that. And if Canvas added that, there would be people who wanted Canvas to not automatically validate it because they want to know that students can enter valid math. This is why programs like WebAssign have dedicated editors that Canvas doesn't have.

One work-around for this type of problem is to ask which of the following is not a factor of x³-4x²-7x+10. Yet that doesn't completely get at it, because they could use something like the rational root theorem to eliminate x-3 (3 isn't a factor of 10), but it wouldn't help them discern x-2 from x+2. For that, they could use Descartes' Rule of Signs and take combinations of factors and run through the possibilities. It all depends on what you're really after.
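
To make the rational root argument concrete, here is a small Python sketch (my own illustration, not something Canvas does) that lists the integer candidates dividing 10 and checks which ones are actually roots. The function name p and the variable names are just placeholders.

```python
# Evaluate p(x) = x^3 - 4x^2 - 7x + 10 at the rational root candidates.
def p(x: int) -> int:
    return x**3 - 4 * x**2 - 7 * x + 10

# Rational root theorem (leading coefficient 1): any rational root is an
# integer that divides the constant term, 10.
divisors = [d for d in range(1, 11) if 10 % d == 0]
candidates = sorted(set(divisors) | {-d for d in divisors})

roots = [c for c in candidates if p(c) == 0]

print(candidates)    # [-10, -5, -2, -1, 1, 2, 5, 10]; note 3 is not a candidate
print(roots)         # [-2, 1, 5], i.e. factors (x+2), (x-1), (x-5)
print(p(2), p(-2))   # -12 0: both are candidates, only x = -2 is a root
```

So 3 is ruled out before any evaluation, while 2 and -2 both survive the candidate test and have to be checked some other way.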

Now, as to the case of regular expressions, I will say that I'm not using Quizzes.Next, so this may not work. There's a page in the Community that says Canvas uses Ruby regular expressions, which are mostly equivalent to PCRE, so I'll work the PCRE variety.

What I do is pull up an online regular expression tester. I tend to like the one titled "Online regex tester and debugger: PHP, PCRE, Python, Golang and JavaScript". I made sure the flags (at the end of the first line) were g (global) and m (multiline). Neither of those is required for a single expression, but I want to put a bunch of lines down for testing, so I need it to match all of them (g) and I need ^ and $ to match the beginning and end of each line (m).
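
If you would rather test offline, the same idea can be sketched in Python. This is only an approximation of the online tester's flags: I'm assuming re.findall plays the role of g and re.MULTILINE plays the role of m, and the sample text is made up.

```python
import re

# Three test lines in one string, the way you would paste them into a tester.
lines = "(x-1)\n(x+2)\n(x-5)"
pattern = r"^\(x-1\)$"

# Without the multiline flag, ^ and $ only anchor to the whole string.
without_m = re.findall(pattern, lines)

# With it, ^ and $ anchor to each line, so the first line is found.
with_m = re.findall(pattern, lines, flags=re.MULTILINE)

print(without_m)  # []
print(with_m)     # ['(x-1)']
```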

To illustrate the complicated mess of trying to use regular expressions, here is one that I used for your problem. I used [+] instead of \+, but that's a personal preference. The ?: near the beginning of line 1 below makes the group non-capturing, but you could leave it out as shown in line 2 (just to be clear, you only need one of these).

/^(?:\(x-5\)|\(x[+]2\)|\(x-1\)){3}$/
/^(\(x-5\)|\(x[+]2\)|\(x-1\)){3}$/

This matches exactly 3 factors in a row, each of which is (x-5), (x+2), or (x-1).

That means that it matches all of these

(x-5)(x+2)(x-1)
(x-5)(x-1)(x+2)
(x+2)(x-5)(x-1)
(x+2)(x-1)(x-5)
(x-1)(x-5)(x+2)
(x-1)(x+2)(x-5)

Unfortunately, it also matches lines like this (which are incorrect)

(x-1)(x-1)(x-1)
(x-1)(x-1)(x+2)
(x-1)(x-1)(x-5)
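
Here is a quick way to check both lists above, sketched in Python. Python's re module is not the Ruby engine that Canvas reportedly uses, so treat this as a sanity check on the pattern rather than proof of how Canvas behaves. The variable names are mine.

```python
import re
from itertools import permutations

FACTORS = ["(x-5)", "(x+2)", "(x-1)"]

# The first pattern from the post, without the surrounding slashes.
any_three = re.compile(r"^(?:\(x-5\)|\(x[+]2\)|\(x-1\)){3}$")

# All 6 orderings of the three distinct factors.
orderings = ["".join(p) for p in permutations(FACTORS)]

# Answers that repeat a factor and should be wrong.
duplicates = ["(x-1)(x-1)(x-1)", "(x-1)(x-1)(x+2)", "(x-1)(x-1)(x-5)"]

all_orderings_match = all(any_three.search(s) for s in orderings)
duplicates_also_match = all(any_three.search(s) for s in duplicates)

print(all_orderings_match)    # True
print(duplicates_also_match)  # True, which is the problem
```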

If you want to allow for spaces and * operators in there, you could do something like this. I don't use Quizzes.Next, but I think the legacy version stripped leading spaces, so a plain ^ may be enough in place of the ^\s* at the beginning.

/^\s*(?:\(\s*x\s*-\s*5\s*\)[\s*]*|\(\s*x\s*[+]\s*2\s*\)[\s*]*|\(\s*x\s*-\s*1\s*\)[\s*]*){3}$/

Then it would match the above, plus crazy stuff like this, some of which is incorrect

(x +2)*( x - 5 )(    x - 1 )
(x +2)*( x - 5 )*(    x - 1 )
(x +2)*( x - 5 )*(    x - 1 )*
(x +2)*( x - 5 )**(    x - 1 )*
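
The four lines above can be run through the pattern with a short Python sketch. Again, this uses Python's re as a stand-in for whatever Canvas runs, and I split the pattern across three string literals only for readability; the pieces concatenate to the same expression.

```python
import re

# Pattern from the post: optional whitespace everywhere, and any run of
# whitespace or * characters allowed after each factor.
spaced = re.compile(
    r"^\s*(?:\(\s*x\s*-\s*5\s*\)[\s*]*"
    r"|\(\s*x\s*[+]\s*2\s*\)[\s*]*"
    r"|\(\s*x\s*-\s*1\s*\)[\s*]*){3}$"
)

samples = [
    "(x +2)*( x - 5 )(    x - 1 )",
    "(x +2)*( x - 5 )*(    x - 1 )",
    "(x +2)*( x - 5 )*(    x - 1 )*",    # trailing * is not valid math
    "(x +2)*( x - 5 )**(    x - 1 )*",   # neither is **
]

results = [bool(spaced.search(s)) for s in samples]
print(results)  # [True, True, True, True]
```

All four are accepted, including the last two that you would not want to give credit for.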

Let's say that you don't care about spaces or times signs: you'll put in the instructions not to use them, or the students will miss the problem. But you do care about students not using (x-1)(x-1)(x-1).

This brings us to the negative lookahead portion of regular expressions. It basically tells the engine to match something only if it is not followed by something else. The syntax is (?! regex ).

/^(?:\(x-5\)(?!\(x-5\))|\(x[+]2\)(?!\(x[+]2\))|\(x-1\)(?!\(x-1\))){3}$/

This says find an (x-5) not followed by another (x-5), a (x+2) not followed by an (x+2) and an (x-1) not followed by an (x-1) and do that 3 times.

This sounds promising, and it does block the first line below as desired, but it still allows the second line (failure), because the lookahead only checks the factor that comes immediately after.

(x-5)(x-5)(x-1)
(x-5)(x+2)(x-5)
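
Those two lines can be checked with a short Python sketch (Python's re standing in for the Canvas engine, which I have not tested):

```python
import re

# Each factor may not be IMMEDIATELY followed by a copy of itself.
adjacent_only = re.compile(
    r"^(?:\(x-5\)(?!\(x-5\))|\(x[+]2\)(?!\(x[+]2\))|\(x-1\)(?!\(x-1\))){3}$"
)

back_to_back = adjacent_only.search("(x-5)(x-5)(x-1)")
separated = adjacent_only.search("(x-5)(x+2)(x-5)")

print(back_to_back is None)   # True: adjacent repeat is rejected
print(separated is not None)  # True: the non-adjacent repeat slips through
```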

Thankfully, you can work around that by putting the greedy .* at the start of each lookahead, which allows anything in between, so the lookahead checks the rest of the line rather than just the next factor.

/^(?:\(x-5\)(?!.*\(x-5\))|\(x[+]2\)(?!.*\(x[+]2\))|\(x-1\)(?!.*\(x-1\))){3}$/

Now I've got something that allows (x-5)(x+2)(x-1) in any order and does not allow repetition of any of them.
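
To convince yourself, you can brute-force every way of writing three of these factors in a row (27 strings) and confirm that only the 6 orderings without repeats are accepted. This is a Python sketch with my own variable names; it checks the pattern in Python's re, not in Canvas itself.

```python
import re
from itertools import permutations, product

FACTORS = ["(x-5)", "(x+2)", "(x-1)"]

# Final pattern from the post: no factor may appear again later in the line.
no_repeats = re.compile(
    r"^(?:\(x-5\)(?!.*\(x-5\))|\(x[+]2\)(?!.*\(x[+]2\))|\(x-1\)(?!.*\(x-1\))){3}$"
)

# Every 3-factor string, repeats included (3 * 3 * 3 = 27 of them).
candidates = ["".join(c) for c in product(FACTORS, repeat=3)]

accepted = sorted(s for s in candidates if no_repeats.search(s))
expected = sorted("".join(p) for p in permutations(FACTORS))

print(len(candidates), len(accepted))  # 27 6
print(accepted == expected)            # True
```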

You can extend that pattern to as many factors as you like. If there is a 4th factor, add it as another alternative and use {4} at the end. It will potentially get messy if there are repeated factors; you'll have to make sure they write those with an exponent, otherwise you'll run into issues with the number of factors.
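
If you have a lot of these to write, the pattern could be generated rather than typed by hand. Below is a minimal Python sketch; build_pattern is a hypothetical helper of my own, it assumes every factor in the list is distinct, and it relies on re.escape to escape the parentheses and the + sign instead of the [+] style used above. The fourth factor (x+3) is made up for the example.

```python
import re
from typing import Sequence


def build_pattern(factors: Sequence[str]) -> "re.Pattern[str]":
    """Hypothetical helper: accept each distinct factor exactly once, any order."""
    parts = []
    for factor in factors:
        literal = re.escape(factor)
        # The factor itself, as long as it does not appear again later.
        parts.append(literal + "(?!.*" + literal + ")")
    alternation = "|".join(parts)
    # Repeat the alternation once per factor, e.g. {4} for four factors.
    return re.compile("^(?:" + alternation + "){" + str(len(factors)) + "}$")


four = build_pattern(["(x-5)", "(x+2)", "(x-1)", "(x+3)"])

print(bool(four.search("(x+3)(x-1)(x+2)(x-5)")))  # True
print(bool(four.search("(x+3)(x-1)(x+2)(x+3)")))  # False: (x+3) repeated
print(bool(four.search("(x-5)(x+2)(x-1)")))       # False: only three factors
```

You would still need to copy the generated expression (four.pattern) into Canvas and test it there, since the escaping style differs slightly from the hand-written version.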

If you want to allow asterisks, then add a [*]* to each, but realize it will then accept a * at the end as well, which is not desirable. I didn't handle the case where people want to write the implied leading coefficient of 1 as in (1x-5) or 1(x-5). Those are mathematically correct, but a regular expression wouldn't know about that.


2 Replies

CraigOgden
Community Member

I agree with everything you said. I know it is not fail-proof, but we are trying to make Canvas work for us.

Thank you so much for helping me out.

On Tue, Jan 1, 2019 at 8:34 PM james@richland.edu <instructure@jiveon.com>