Regular Expressions PHP Help

In the last chapter’s example (a web form that lets users run SQL against a MySQL database), you did one of the most common things programmers do. You wrote code that solves a problem, but it’s ugly, messy, and a little hard to understand. Unfortunately, most programmers leave code in that state. That’s something you want to avoid.

Bad code is like sloppy plumbing or a poorly constructed house frame. At some point, things are going to go bad, and someone is going to have to fix problems. And, if you’ve ever had an electrician tell you what he has to charge you because the guy who did the work initially did it wrong, you know how expensive it is to fix someone else’s mistakes.

But here’s the thing: Even good code is going to fail at some point. Any time you have a system that involves humans, at some point, someone will do something unexpected, or maybe just something you never thought about dealing with when you wrote your code. And that’s when you’re the electrician, trying to fix things when the customer’s unhappy. But in this scenario, there’s nobody else to blame.

So, writing ugly code that works really isn’t an option. At the moment, the code in run_query.php is very ugly. It’s all those if statements that are trying to figure out whether the user entered a CREATE or an UPDATE or an INSERT, or maybe a SELECT…or who knows what else? What you really need is a way to search the incoming query for all those keywords at once. And then there’s converting text to uppercase, and dealing with white space, and making sure the SQL keyword you want is at the beginning of the query.

Unfortunately, there’s no way to solve this problem elegantly by using strpos and the string manipulation you’ve done so far. Fortunately, though, you have another option: regular expressions. Regular expressions (also known in programmer-ese as regexes) are like a keg of gunpowder: extremely powerful, but perfectly capable of blowing up your program and creating hours of frustration. That’s okay, though, because you’re not running off to battle just yet.

Before you’re done, you’ll have learned how to use regular expressions, cut out all but one of those annoying if statements for searching through $query_text, and made your program easier to troubleshoot when problems occur down the line.

String Matching, Double-Time

So far, you’ve been using strpos to perform string searching, and you’ve been passing into that function your string and then some additional characters or a string for which to look. The problem is that using strpos in this way limits you to a single search string at a time; you can search for UPDATE and you can search for DROP, but not at the same time.

Here’s where regular expressions come into the picture. A regular expression is just what it sounds like: a regular sequence of characters or numbers or some other pattern (an expression) for which you want to search. If you had a string like “abcdefghijklmnopqrstuvwxyz,” you could search for the pattern, or regular expression, “abc”. It would show up once, of course, which isn’t very “regular.”

However, suppose that you had an entire web page, and you wanted to search for links. You might use an expression like “<a” to find all the link elements. You might find none, or one, or ten; with a regular expression, you can search for practically anything you want. It does get a bit murky though, so the best place to start is at the beginning.

A Simple String Searcher

Just about the simplest regular expression you can come up with is a single simple letter, like “a” or “m”. Thus, the regular expression “a” will match any “a”. Simple, right?

In PHP, if you want to search by using regular expressions, you use the preg_match function. Even though that sounds like something related to childbirth, it’s pronounced “p-reg”; the “p” comes from Perl, because PHP’s preg functions use Perl-compatible regular expressions (PCRE). However, no matter how you say it (and what thoughts it conjures up), it’s used like this:

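<?php
// The string to look in, and the pattern to look for
$string_to_search = "Martin OMC-28U";
$regex = "/OM/";

// preg_match returns 1 if the pattern is found, 0 if it is not
$num_matches = preg_match($regex, $string_to_search);
if ($num_matches > 0) {
  echo "Found a match!";
} else {
  echo "No match. Sorry.";
}
?>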

Admittedly, this isn’t very exciting. Before you can walk, though, you gotta crawl. And part of crawling is understanding just how you write a regular expression.

First, regular expressions are just strings, so you wrap them in quotes. You’ll typically use double quotes (") rather than single quotes (') because PHP doesn’t do as much helpful processing on single-quoted strings as double-quoted ones. (For more advice on how to use quotes in PHP, see the box “Which Quote Is the Best Quote?”)

Additionally, regular expressions begin and end with a forward slash. It’s everything between those slashes that makes up the meat of the expression. For example, “/OM/” is a regular expression that searches for OM.

Of course, preg_match has some wrinkles, too. First, as you’ve seen, it takes a regular expression as the first argument, and then the string in which to search as the second. Then, it returns the number of matches, rather than the position at which a match was found. Here’s the first real wrinkle: preg_match will never return more than 1. It returns 0 if there are no matches, and 1 upon the first match, and then it simply stops searching.
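To see that 0-or-1 behavior in action, here’s a small sketch. The sample string is made up for illustration; preg_match_all is PHP’s companion function that keeps searching and counts every match instead of stopping at the first.

```php
<?php
// Hypothetical sample text: "OM" appears twice.
$string_to_search = "Martin OMC-28U and Martin OM-21";

// preg_match stops at the first match, so the result is 1, not 2.
$first_only = preg_match("/OM/", $string_to_search);

// preg_match_all keeps going and counts every match.
$all_matches = preg_match_all("/OM/", $string_to_search);

echo "preg_match: {$first_only}\n";
echo "preg_match_all: {$all_matches}\n";
```

For a yes-or-no question like “is there an OM anywhere in here?”, the early exit of preg_match is exactly what you want.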

Which Quote Is the Best Quote?

Almost every programming language seemingly treats single-quoted strings ('My name is Bob') and double-quoted strings ("I am a carpenter.") the same way. However, also in almost every programming language, there’s a lot more going on than you might realize, all based upon which quotation mark you use.

In general, there is less processing performed on single-quoted strings. But what processing occurs in the first place? Take the statement I’m going to the bank. If you put that in a single-quoted string, you get 'I'm going to the bank.' But PHP is going to bark at you, because the single quote in I’m looks like it’s ending the simple string 'I', and all the rest (m going to the bank.') must just be something else. Of course, that’s not what you mean, so you do one of two things: you either switch to double quotes and move on, or you escape the single quote with a backslash ('I\'m going to the bank.').

One last note: in 99 percent of the applications you write, the type of quotes you use doesn’t matter. The processing involved in handling those extra escape characters and variables isn’t going to frustrate your customers or send server hard drives or RAM chips into a frenzy. You can happily use double-quoted strings all the time, and you’ll probably never notice any issues at all.
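If you want to watch that extra processing happen, here’s a minimal sketch. The variable $name and its value are invented for the example; the behavior shown is standard PHP string handling.

```php
<?php
$name = "Bob";  // hypothetical value for illustration

// Single quotes: $name and \n are left exactly as typed.
$single = 'Hello, $name\n';

// Double quotes: $name is replaced and \n becomes a real newline.
$double = "Hello, $name\n";

// Two ways to get an apostrophe into a string.
$escaped  = 'I\'m going to the bank.';
$switched = "I'm going to the bank.";

echo $single, "\n";
echo $double;
echo ($escaped === $switched) ? "Same string\n" : "Different strings\n";
```

The last line prints “Same string”: escaping the quote and switching quote types are just two routes to the same text.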

Search for One String … Or Another

So far, there’s not a lot that preg_match seems to offer that you don’t already have with strpos. But there’s a lot more that you can do, and one of the coolest is searching for one string or another. To do this, you use a special character called the pipe. The pipe looks like a vertical line: |. It’s usually above the backslash character, over on the right side of your keyboard.

First, though, notice a wrinkle in the expressions below: the backslash (\). This is escaping the period, because a period in a regular expression usually means “match any single character.” But in this case, you want to match an actual period. So, \. will match a period, and nothing but a period.

/Mr\. Smith/ matches “Mr. Smith” but will skip right over “Dr. Smith.” However, /(Mr|Dr)\. Smith/ matches either “Mr. Smith” or “Dr. Smith.”

That means that this little code snippet would find a match in both cases:

// This will match
echo "Matches: " . preg_match("/(Mr|Dr)\. Smith/", "Mr. Smith");

// So will this
echo "Matches: " . preg_match("/(Mr|Dr)\. Smith/", "Dr. Smith");
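It’s worth seeing what goes wrong when the backslash is left out. In this sketch the test string “Mrs Smith” is made up; the point is that an unescaped period happily stands in for the “s”.

```php
<?php
// Unescaped: the period matches any single character, including "s".
$loose = preg_match("/Mr. Smith/", "Mrs Smith");

// Escaped: only a literal period will do, so this one fails.
$strict = preg_match("/Mr\. Smith/", "Mrs Smith");

// The pipe still works as before with the escaped period.
$either = preg_match("/(Mr|Dr)\. Smith/", "Dr. Smith");

echo "Loose: {$loose}, strict: {$strict}, either: {$either}\n";
```

This prints “Loose: 1, strict: 0, either: 1”. The loose version is the kind of match that slips by unnoticed until a user finds it for you.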

With this new wrinkle, you should be able to make some pretty massive changes to run_query.php from the last chapter. Open that file and take a look. As a reminder, here’s the old version:

<?php
// Note: the mysql_* functions were removed in PHP 7; this listing targets PHP 5.
require '../../scripts/database_connection.php';

$query_text = $_REQUEST['query'];
$result = mysql_query($query_text);
if (!$result) {
  die("<p>Error in executing the SQL query " . $query_text . ": " .
      mysql_error() . "</p>");
}

$return_rows = false;
$uppercase_query_text = strtoupper($query_text);
$location = strpos($uppercase_query_text, "CREATE");
if ($location === false) {
  $location = strpos($uppercase_query_text, "INSERT");
  if ($location === false) {
    $location = strpos($uppercase_query_text, "UPDATE");
    if ($location === false) {
      $location = strpos($uppercase_query_text, "DELETE");
      if ($location === false) {
        $location = strpos($uppercase_query_text, "DROP");
        if ($location === false) {
          // If we got here, it's not a CREATE, INSERT, UPDATE,
          // DELETE, or DROP query. It should return rows.
          $return_rows = true;
        }
      }
    }
  }
}

if ($return_rows) {
  // We have rows to show from the query
  echo "<p>Results from your query:</p>";
  echo "<ul>";
  while ($row = mysql_fetch_row($result)) {
    echo "<li>{$row[0]}</li>";
  }
  echo "</ul>";
} else {
  // No rows. Just report if the query ran or not
  echo "<p>Your query was processed successfully.</p>";
  echo "<p>{$query_text}</p>";
}
?>

It’s all that if stuff that really is messy. But with regular expressions, you can make some pretty spectacular changes:

<?php
// require and database connection code

// Assume rows come back unless the query contains a "change" keyword
$return_rows = true;
if (preg_match("/(CREATE|INSERT|UPDATE|DELETE|DROP)/",
               strtoupper($query_text))) {
  $return_rows = false;
}

if ($return_rows) {
  // display code
}
?>

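Take a close look here, especially at the fairly long condition for the if statement. It converts the query to uppercase with strtoupper, and then asks preg_match whether any one of the five keywords shows up in the result. If one does, $return_rows flips to false.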
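One way to get comfortable with that condition is to pull it out into a function and feed it a few queries. This is only a sketch: the function name query_returns_rows and the table name my_table are invented for illustration and aren’t part of the script.

```php
<?php
// Hypothetical helper: returns true when the query should produce rows.
function query_returns_rows($query_text) {
  // Uppercase first so "drop" and "DROP" are treated alike.
  $uppercase_query_text = strtoupper($query_text);

  // The pipes mean "any one of these five keywords".
  $pattern = "/(CREATE|INSERT|UPDATE|DELETE|DROP)/";

  // preg_match returns 1 on a match, so 0 means no keyword was found.
  return preg_match($pattern, $uppercase_query_text) === 0;
}

var_dump(query_returns_rows("SELECT * FROM my_table"));
var_dump(query_returns_rows("drop table my_table"));
```

The first call reports true and the second false, which is the same decision the nested if statements were making, in a fraction of the space.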

Getting into Position

One of the problems with even this streamlined version of run_query.php is that it looks for a match anywhere within the input query. If you read the box about white space trimming on page 154, you know there are still problems. You need to trim your user’s query string, which is pretty simple with PHP’s trim function.

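There’s a second problem. Picture a SELECT whose text happens to contain one of the other keywords, in a column value for instance. That query returns rows, but if it’s interpreted as an UPDATE or DROP, your script will not show the returned rows.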
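Here’s a sketch of both issues at once. The query text and the table name my_table are invented for the example; the check itself is the unanchored one from the streamlined script.

```php
<?php
// Hypothetical input: leading spaces, plus a SELECT that mentions "update".
$query_text = "   SELECT * FROM my_table WHERE note = 'update pending'";

// trim removes whitespace from both ends of the string.
$query_text = trim($query_text);

// The unanchored keyword check finds UPDATE inside the quoted value.
$looks_like_change = preg_match("/(CREATE|INSERT|UPDATE|DELETE|DROP)/",
                                strtoupper($query_text));

echo "Starts with: " . substr($query_text, 0, 6) . "\n";
echo "Flagged as a non-row query: {$looks_like_change}\n";
```

Trimming worked, since the query now starts with SELECT, but the check still flags it with a 1. Where the keyword appears matters just as much as whether it appears.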

It took some additional if conditions to get this to work before, but that was before you were taking over the world one regular expression at a time. With regular expressions, it’s easy to tell PHP, “I want this expression, but only at the beginning of the search string.”

To accomplish this feat of wizardry, just add the caret (^) to the beginning of your regular expression, right after the opening slash, which basically says, “at the beginning.”

// Matches
echo "Matches: " . preg_match("/^(Mr|Dr)\. Smith/",
                              "Dr. Smith") . "\n";

// Does NOT match
echo "Matches: " . preg_match("/^(Mr|Dr)\. Smith/",
                              "  Dr. Smith") . "\n";

Looking back, in the first case /^(Mr|Dr)\. Smith/ matches “Dr. Smith” because the string begins with “Dr. Smith” (“Mr. Smith” would be okay, too). But the second string does not match, because the ^ rejects the leading spaces.

Taking this back to your query runner, you’d do something like this:
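A minimal sketch of that change follows. The helper name starts_with_change_keyword and the table my_table are assumptions made for the example, not names from the script.

```php
<?php
// Hypothetical helper showing the anchored check.
function starts_with_change_keyword($query_text) {
  // ^ pins the keyword group to the very start of the string.
  return preg_match("/^(CREATE|INSERT|UPDATE|DELETE|DROP)/",
                    strtoupper(trim($query_text))) === 1;
}

// A SELECT that merely mentions "update" is no longer misread.
$return_rows = !starts_with_change_keyword(
    "SELECT * FROM my_table WHERE note = 'update pending'");

var_dump($return_rows);
```

With the caret in place, $return_rows comes out true for the SELECT, while a query that really does begin with DROP or UPDATE is still caught.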

The dollar sign ($) does the same job at the other end: it anchors a match to the end of the string. Checked against /^(Mr|Dr)\. Smith$/, the string “Dr. Smith ” does not match, because the $ doesn’t allow for the trailing space. “Dr. Smith” does match, though, because there’s no leading space (which satisfies the ^(Mr|Dr) part) and no trailing space (which satisfies the Smith$ part).

In fact, when you have a ^ at the beginning of your expression and a $ at the end, you’re requiring an exact match not just within the search string but to the string itself. It’s like you’re saying that the search string should equal the regular expression. Of course, if you were doing a real equivalency in PHP (with == or ===), you couldn’t have those nifty or statements with |, or any of the other cool things regular expressions offer.
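Here’s a short sketch of ^ and $ working together. The test strings are made up; each one differs from “Dr. Smith” only by a stray space or newline.

```php
<?php
$regex = "/^(Mr|Dr)\. Smith$/";

// A trailing space sits between "Smith" and the end, so $ fails.
$trailing = preg_match($regex, "Dr. Smith ");

// Nothing before and nothing after: an exact match.
$exact = preg_match($regex, "Dr. Smith");

// A leading space breaks the ^ instead.
$leading = preg_match($regex, " Mr. Smith");

// Subtlety: by default, $ also matches just before a final newline.
$newline = preg_match($regex, "Dr. Smith\n");

echo "{$trailing} {$exact} {$leading} {$newline}\n";
```

This prints “0 1 0 1”. That last result surprises a lot of people, so keep it in mind if a trailing newline would matter in your data.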

Ditch trim and strtoupper

As long as you’re simplifying your code with some regular expression goodness, try taking things further. Right now, you’re converting $query_text to all uppercase characters by using strtoupper and then searching for “CREATE”, “INSERT”, and the like within that uppercase version of the query.

But regular expressions are happy to be case-insensitive, meaning that they don’t care whether they match uppercase or lowercase versions of a word. Just add an “i” to the end of your expression, after the closing forward slash.
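A quick sketch of the “i” at work, with no strtoupper in sight. The queries and the table name my_table are invented for the example.

```php
<?php
// The trailing i makes the whole pattern case-insensitive.
$pattern = "/^(CREATE|INSERT|UPDATE|DELETE|DROP)/i";

$lower  = preg_match($pattern, "drop table my_table");
$mixed  = preg_match($pattern, "Update my_table SET note = 'x'");
$select = preg_match($pattern, "select * from my_table");

echo "{$lower} {$mixed} {$select}\n";
```

This prints “1 1 0”: both spellings of the change keywords are caught, and the lowercase select is correctly left alone.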

What about trimming white space? Well, you really don’t need to trim $query_text; instead, in your regular expression, you just want to ignore leading spaces. At least, that’s the result you want. In PHP, you have to think of it this way:

1. Begin by matching any number of spaces, including when there are no spaces.

2. Then, after some indeterminate number of spaces, look for (CREATE|INSERT|UPDATE|DELETE|DROP).

This means that while you’re ignoring those spaces in your particular situation (figuring out whether the query is a CREATE, or UPDATE, or whatever), you’re really just doing another type of matching.

Now, you know how to match a space: you just include it in your regular expression. For example, /^ Mr\. Smith/ requires an opening space. “Mr. Smith” would not match, but “ Mr. Smith” would.

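But that requires exactly one space. How can you say that more than one space is okay? That’s when you need the + (plus) character. The + character says, “The thing that came just before me can appear one or more times.” Its sibling, the * (asterisk), is more relaxed still: it accepts zero or more, which is what you need when there might be no leading spaces at all.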
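The difference between + and * is easiest to see side by side. This is just a sketch with made-up test strings, one with no leading spaces and one with three.

```php
<?php
$one_or_more  = "/^ +Mr\. Smith/";  // at least one leading space
$zero_or_more = "/^ *Mr\. Smith/";  // leading spaces are optional

$plus_none  = preg_match($one_or_more,  "Mr. Smith");
$plus_three = preg_match($one_or_more,  "   Mr. Smith");
$star_none  = preg_match($zero_or_more, "Mr. Smith");
$star_three = preg_match($zero_or_more, "   Mr. Smith");

echo "{$plus_none} {$plus_three} {$star_none} {$star_three}\n";
```

This prints “0 1 1 1”. Only the + version turns away the string without a leading space, so for ignoring optional whitespace in front of a query, * is the one you want.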

Back to Square One

Second, your code is more sensible. It starts with the presumption that you’ll return rows. Then, based on a condition, it might change that presumption. This is natural human logic: start one way, and if something else is going on, go another way. That’s a lot better than the sort of backward logic of your earlier version.

And you did it without a lot of messy and obscure hard-to-read code. (Well, it might be a little tricky for your friends still scared off by regular expressions. But now you can teach them what’s up, and that’s a good thing, too.)

Searching for Sets of Characters

Now that you’ve taken care of leading spaces, you need to handle what your user types regardless of case and extra line breaks, like the example in Figure 6-2. Not only is there questionable use of the Shift key, there might also be leading spaces. But even if there aren’t leading spaces, there’s something else here: a return. Your clever, endearing users have done something you’d probably never think about: They pressed Enter a few times before typing in their SQL.

Your regular expression might not handle the query in Figure 6-2 as a DROP, despite you handling leading spaces and issues with capitalization. That’s because Enter produces some special characters, usually either \n, or in some situations, \r\n, or, just to keep things interesting, occasionally just \r.

So, what can you do? Well, it’s easy to account for multiple characters like this: the regular expression \n* will match any number of new lines, and \r* will match any number of carriage returns. But what about \r\n? \r*\n* would match that, but what about spaces? You could do \r*\n* * and match Enter followed by spaces, but if you start to think about spaces and then Enters and then more spaces…and more Enters…(you get the idea).

Of course, the whole point of regular expressions is to get away from that sort of thing. To do so, you search for any of a set of characters. That’s really what you want: accept any number (including zero) of any of a set of characters, a \r, a \n, or a space. You don’t care how many appear, or in what order, either.
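A set of characters is written inside square brackets. Here’s a minimal sketch of a set followed by *, tried against some made-up input that mixes Enters, spaces, and a tab; the table name my_table is a placeholder.

```php
<?php
// Any mix of space, tab, carriage return, and newline, zero or more times,
// followed by one of the keywords (in any capitalization).
$pattern = "/^[ \t\r\n]*(CREATE|INSERT|UPDATE|DELETE|DROP)/i";

$messy_drop   = preg_match($pattern, "\r\n \r\n\t drop table my_table");
$clean_drop   = preg_match($pattern, "DROP TABLE my_table");
$messy_select = preg_match($pattern, "\n\n select * from my_table");

echo "{$messy_drop} {$clean_drop} {$messy_select}\n";
```

This prints “1 1 0”. The order and number of whitespace characters no longer matter, because the set accepts any of them on each repetition.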

A set goes in square brackets, so [ \t\r\n]* handles spaces, the two flavors of new lines, and tosses in \t for tab characters. No matter how many leading spaces, tabs, or new lines there are, your regular expression is happy to handle them. In fact, this sort of whitespace matching is so common that regular expressions can use \s as an abbreviation for [ \t\r\n]. And you can simplify things even further:

$return_rows = true;
if (preg_match("/^\s*(CREATE|INSERT|UPDATE|DELETE|DROP)/i",
               $query_text)) {
  $return_rows = false;
}
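Before trying it in the browser, you can sanity-check the expression on its own. In this sketch the function name should_return_rows, the sample queries, and the table name my_table are all invented for illustration.

```php
<?php
// Hypothetical helper wrapping the whitespace-tolerant, case-insensitive check.
function should_return_rows($query_text) {
  $pattern = "/^\s*(CREATE|INSERT|UPDATE|DELETE|DROP)/i";
  return preg_match($pattern, $query_text) === 0;
}

// Made-up inputs; my_table is a placeholder table name.
$samples = array(
  "\r\n\r\n   dRoP tAbLe my_table",
  "\tinsert into my_table values (1)",
  "  SELECT * FROM my_table WHERE note = 'drop'",
);

foreach ($samples as $sample) {
  echo should_return_rows($sample) ? "rows\n" : "no rows\n";
}
```

The first two samples report “no rows” and the SELECT reports “rows”, even though it contains the word drop, because the keyword isn’t at the start.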

Try this out. Enter the SQL shown back in Figure 6-2 and submit your query. You’ll probably get something similar to Figure 6-3, which means you’re not done yet. The problem here isn’t your regular expression. It’s really that you’re trying to pass into mysql_query some queries that haven’t been screened much for problems, like all those extra \r\ns at the beginning.

In fact, there are lots of queries that will create problems for run_query.php, regardless of how clean your regular expression code is. Try entering this query:

That might seem simple enough, but it’s still going to break your script. It doesn’t matter whether you have anything in the urls table; you’ll still get an error, as shown in Figure 6-4.

Frankly, you could spend weeks writing all the code required to handle every possible SQL query, to make sure the right things are accepted and the wrong ones aren’t, and to handle all the various types of queries.

But that’s not a good idea. Just taking in any old SQL query is, in fact, a very bad idea. What’s a much better idea is to take a step back and think about what your users really need. It’s probably not a blank form, and so in the next chapter, you’ll give them what they need: a normal web form that just happens to talk to MySQL on the back end.

Regular Expressions: To Infinity and Beyond

It’s not an exaggeration to say you’ve just barely scratched the surface of regular expressions. Although you have a strong grasp of the basics (from matching to ^ and $ and the various flavors of preg_match, from position and whitespace to + and * and sets), there are more than a few trees that have sacrificed themselves to produce all the paper out there with text on regular expressions.

But don’t be freaked out or daunted, and don’t think you have to stop working on your PHP and MySQL skills until you’ve mastered regular expressions. First, mastery is elusive, and even the best regular expression programmers use Google to refresh their memories on how to get just the right sequence of characters within their slashes. Just be on the lookout for chances to use regular expressions. And, as you get better at PHP, you’ll use them more often, and they’ll slowly become as familiar to you as PHP, or HTML, or any of the other things you’ve been doing over and over.

Regular Expressions Aren’t Just for PHP

As you’re probably seeing, it does take some work to get very far with regular expressions. There are lots of weird characters both to find on your keyboard and to work into your expressions. Without a doubt, it doesn’t take long for a regular expression to start to look like something Q*bert might say.

But the work rewards you in more ways than you might realize. For instance, JavaScript has complete support for regular expressions, too. Methods like replace() in JavaScript take in regular expressions, as do the match() methods on strings. So, nearly everything you’ve learned in PHP translates over. You also get some nice benefits in HTML5. You can use regular expressions in an HTML5 form to provide patterns against which data is validated. Take heart; this work in PHP is helping you out in almost every aspect of web programming.

The moral of this story? What you’re learning about SQL applies to more than MySQL, and what you’re learning about regular expressions applies to more than PHP. Your skills are growing; use them.

A Little Cleanup: Remove the echo Statements

Before moving on, there’s just one last thing you need to take care of. Right now, your database_connection.php script should look like this:

<?php
require 'app_config.php';

// Connect to the server, or stop with an error message
mysql_connect(DATABASE_HOST, DATABASE_USERNAME, DATABASE_PASSWORD)
  or die("<p>Error connecting to database: " .
         mysql_error() . "</p>");
echo "<p>Connected to MySQL!</p>";

// Select the database, or stop with an error message
mysql_select_db(DATABASE_NAME)
  or die("<p>Error selecting the database " .
         DATABASE_NAME . mysql_error() . "</p>");
echo "<p>Connected to MySQL, using database " .
     DATABASE_NAME . ".</p>";
?>

There’s nothing wrong here, and it’s quite informative with those echo statements. But, in the next chapter and beyond, you’re going to start responding in your PHP scripts by using HTML rather than plain old text. As you’ll soon see, your PHP will usually send back HTML when it’s called and interpreted.

Now, when your scripts respond with HTML, and they require or include database_connection.php, you really don’t want those echo statements. They’ll show up before your script’s HTML, and generally look like either debugging information or a programming error. So, go ahead and get rid of those. When you’re done, database_connection.php should look like this:

<?php
require 'app_config.php';

// Connect to the server, or stop with an error message
mysql_connect(DATABASE_HOST, DATABASE_USERNAME, DATABASE_PASSWORD)
  or die("<p>Error connecting to database: " .
         mysql_error() . "</p>");

// Select the database, or stop with an error message
mysql_select_db(DATABASE_NAME)
  or die("<p>Error selecting the database " .
         DATABASE_NAME . mysql_error() . "</p>");
?>

 

Posted on January 12, 2016 in Installing PHP on Windows Without WAMP
