Lesson 5

Quantifiers

Introduction

In Lesson 2, we learnt that a ? after a character makes it optional.

This was an example of a 'quantifier'.

Quantifiers allow us to specify how many of a given character (or characters) must be present in the string.

Lazy or Greedy?

Quantifiers can either be 'greedy' or 'lazy'.

Greedy quantifiers will look for the longest possible match. Lazy quantifiers do the opposite: they look for the shortest possible match.

This will become clearer as we look at specific quantifiers later.

Adding a ? after a quantifier makes it lazy. For example, the lazy version of the * quantifier is *?.

Don't confuse this with using the ? on its own, which - as we learnt in Lesson 2 - makes a character optional.
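To make the difference concrete, here is a minimal JavaScript sketch. The sample string 'chooo' and the three patterns are illustrative assumptions chosen for this comparison; they are not taken from the interactive examples in this lesson.

```javascript
// Illustrative sample string (an assumption, not from the lesson).
const sample = "chooo";

// ? on its own: the 'o' is optional, so at most one 'o' is matched.
const optional = sample.match(/cho?/)[0];

// * is greedy: it takes as many 'o' characters as it can.
const greedy = sample.match(/cho*/)[0];

// *? is lazy: it takes as few 'o' characters as it can, here none.
const lazy = sample.match(/cho*?/)[0];

console.log(optional); // "cho"
console.log(greedy);   // "chooo"
console.log(lazy);     // "ch"
```

The same ? symbol therefore plays two roles: after a character it is itself a quantifier, and after another quantifier it switches that quantifier to lazy mode.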

Zero-or-more times

The * quantifier means 'match zero or more times'.

This means that the preceding character does not need to occur, but if it does occur then it is captured as part of the match.

Here we capture every ch. If that 'ch' is followed by any 'o' characters, then we'll capture those too (o*):

Try it out!

ch choo chooo choooo chooooo!

Expression:>

Notice that the first 'ch' is still captured, even though it is not followed by an 'o'.
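The same behaviour can be reproduced outside the interactive box. This is a small JavaScript sketch; the expression used in the box is not shown in this text, so /cho*/g is an assumption based on the description above.

```javascript
const text = "ch choo chooo choooo chooooo!";

// 'ch' followed by zero or more 'o' characters.
// The g (global) modifier returns every match, not just the first.
const matches = text.match(/cho*/g);

console.log(matches);
// [ 'ch', 'choo', 'chooo', 'choooo', 'chooooo' ]
```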

One-or-more times

The + means that the character must occur at least once.

This expression captures the characters 'ch', followed by one or more 'o' characters (o+).

This is the greedy version, which will keep capturing the 'o' characters until there are no more:

Try it out!

ch choo chooo choooo chooooo!

Expression:>

The lazy version (+?), by contrast, will find the first matching 'o' and then stop:

Try it out!

ch choo chooo choooo chooooo!

Expression:>

Notice that neither of these matches the first 'ch', because it is not followed by an 'o'.
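As a rough check, the two versions can be compared side by side in JavaScript. The expressions /cho+/g and /cho+?/g are assumed from the descriptions above, since the contents of the interactive boxes are not shown in this text.

```javascript
const text = "ch choo chooo choooo chooooo!";

// Greedy: 'ch' followed by as many 'o' characters as possible (at least one).
const greedy = text.match(/cho+/g);

// Lazy: 'ch' followed by as few 'o' characters as possible (still at least one).
const lazy = text.match(/cho+?/g);

console.log(greedy); // [ 'choo', 'chooo', 'choooo', 'chooooo' ]
console.log(lazy);   // [ 'cho', 'cho', 'cho', 'cho' ]
```

Both versions find four matches, because both skip the bare 'ch'. They differ only in how many 'o' characters each match includes.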

Exactly n-times

To specify the exact number of times a character must match, you can use the {n} quantifier.

For example, /p{5}/ means that the character 'p' must occur exactly 5 times. This would be the same as writing the expression /ppppp/.

Here, we'll select ch, followed by exactly 2 'o' characters ({2}).

Try changing the quantifier and see what happens:

Try it out!

ch choo chooo choooo chooooo!

Expression:>
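To experiment with different values of n outside the interactive box, a sketch like the following may help. The helper name matchExactly is hypothetical, and the pattern cho{n} is assumed from the description above.

```javascript
const text = "ch choo chooo choooo chooooo!";

// Hypothetical helper: builds /cho{n}/g for a given n and returns all matches.
// String.prototype.match returns null when nothing matches, hence the fallback.
function matchExactly(n) {
  const pattern = new RegExp(`cho{${n}}`, "g");
  return text.match(pattern) || [];
}

console.log(matchExactly(2)); // [ 'choo', 'choo', 'choo', 'choo' ]
console.log(matchExactly(4)); // [ 'choooo', 'choooo' ]
console.log(matchExactly(6)); // []
```

Note that words with more than n 'o' characters still produce a match: the expression is not anchored, so it takes 'ch' plus the first n 'o' characters and leaves the rest unmatched.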

Between n and m times

You can use two numbers inside the brackets ({n,m}) to specify the minimum and maximum number of times the character must match.

Here, we look for each ch that is followed by 3 to 5 'o's.

Try it out!

ch choo chooo choooo chooooo!

Expression:>

The expression above is 'greedy', meaning it will capture all of the 'o' characters it can (up to 5).

As always, we can add a ? after this quantifier to make it 'lazy'.

The lazy version of this expression will still only match ch when it is followed by 3 to 5 'o's, but it will only capture the minimum number of 'o' characters it can, 3 in our example:

Try it out!

ch choo chooo choooo chooooo!

Expression:>
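The greedy and lazy range quantifiers can be compared with a short JavaScript sketch. The expressions /cho{3,5}/g and /cho{3,5}?/g are assumed from the descriptions above, and the extra string with seven 'o' characters is an illustrative addition to show the upper limit.

```javascript
const text = "ch choo chooo choooo chooooo!";

// Greedy: between 3 and 5 'o' characters, as many as possible.
const greedy = text.match(/cho{3,5}/g);

// Lazy: between 3 and 5 'o' characters, as few as possible.
const lazy = text.match(/cho{3,5}?/g);

console.log(greedy); // [ 'chooo', 'choooo', 'chooooo' ]
console.log(lazy);   // [ 'chooo', 'chooo', 'chooo' ]

// Illustrative extra input: 'ch' followed by seven 'o' characters.
// The greedy version stops at the maximum of five.
const long = "ch" + "o".repeat(7);
const capped = long.match(/cho{3,5}/)[0];
console.log(capped.length); // 7 ('ch' plus five 'o' characters)
```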

At least n-times

If you leave out the second number in the brackets ({n,}), there is no upper limit on the number of times the character can match.

Here, we capture ch followed by at least 3 'o' characters.

The greedy version will capture all of the 'o's it finds:

Try it out!

ch choo chooo choooo chooooo!

Expression:>

... but the lazy version will capture the minimum it can, 3 in this case:

Try it out!

ch choo chooo choooo chooooo!

Expression:>
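Here is a minimal JavaScript sketch of the open-ended quantifier. The expressions /cho{3,}/g and /cho{3,}?/g are assumed from the descriptions above, and the string with twenty 'o' characters is an illustrative addition to show that there is no upper limit.

```javascript
const text = "ch choo chooo choooo chooooo!";

// Greedy: at least 3 'o' characters, as many as possible.
const greedy = text.match(/cho{3,}/g);

// Lazy: at least 3 'o' characters, but no more than needed.
const lazy = text.match(/cho{3,}?/g);

console.log(greedy); // [ 'chooo', 'choooo', 'chooooo' ]
console.log(lazy);   // [ 'chooo', 'chooo', 'chooo' ]

// Illustrative extra input: with no upper limit, the greedy version
// matches all twenty 'o' characters.
const long = "ch" + "o".repeat(20);
const whole = long.match(/cho{3,}/)[0];
console.log(whole === long); // true
```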

Mini-Game

There was a lot going on in this lesson, so let's recap what we've learnt with a quick game.

Feel free to refer back to the quantifiers above for help.

Modify this expression to select all of the numerical values in this text, including the % symbol where there is one:

The earth is 12742 kilometres in diameter, orbits the sun at 30 kilometres per second, and is about 71% water.


Your expression:

Hints:

  • There are 3 values to select from the text above
  • Those numbers have at least one digit, and are sometimes followed by a % symbol
  • Remember to add the global modifier (g) at the end of the expression
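If you get stuck, here is one possible answer, sketched in JavaScript. It is not necessarily the solution the game expects, and it assumes the \d shorthand for a digit (equivalent to [0-9] in JavaScript), which may not have been introduced yet.

```javascript
const text =
  "The earth is 12742 kilometres in diameter, orbits the sun at " +
  "30 kilometres per second, and is about 71% water.";

// \d+  one or more digits
// %?   optionally followed by a % symbol
// g    global modifier: find every match, not just the first
const values = text.match(/\d+%?/g);

console.log(values); // [ '12742', '30', '71%' ]
```

This combines two quantifiers from the lesson: + for 'at least one digit' and ? for the optional % symbol.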