Pattern analyzing

pcre_extra *pcre_study(const pcre *code, int options,
	 const char **errptr);

If a compiled pattern is going to be used several times, it is worth spending more time analyzing it in order to speed up matching. The function pcre_study() takes a pointer to a compiled pattern as its first argument. If studying the pattern produces additional information that will help speed up matching, pcre_study() returns a pointer to a pcre_extra block, in which the study_data field points to the results of the study.

The returned value from pcre_study() can be passed directly to pcre_exec() or pcre_dfa_exec(). However, a pcre_extra block also contains other fields that can be set by the caller before the block is passed; these are described below in the section on matching a pattern.

If studying the pattern does not produce any useful information, pcre_study() returns NULL. In that circumstance, if the calling program wants to pass any of the other fields to pcre_exec() or pcre_dfa_exec(), it must set up its own pcre_extra block.

The second argument of pcre_study() contains option bits. At present, no options are defined, and this argument should always be zero.

The third argument for pcre_study() is a pointer for an error message. If studying succeeds (even if no data is returned), the variable it points to is set to NULL. Otherwise it is set to point to a textual error message. This is a static string that is part of the library. You must not try to free it. You should test the error pointer for NULL after calling pcre_study(), to be sure that it has run successfully.

This is a typical call to pcre_study():

  const char *error;
  pcre_extra *pe;
  pe = pcre_study(
    re,             /* result of pcre_compile() */
    0,              /* no options exist */
    &error);        /* set to NULL or points to a message */

Studying a pattern does two things: first, a lower bound for the length of subject string that is needed to match the pattern is computed. This does not mean that there are any strings of that length that match, but it does guarantee that no shorter strings match. The value is used by pcre_exec() and pcre_dfa_exec() to avoid wasting time by trying to match strings that are shorter than the lower bound. You can find out the value in a calling program via the pcre_fullinfo() function.

Second, for non-anchored patterns that do not have a single fixed starting character, a bitmap of possible starting bytes is created. This speeds up finding a position in the subject at which to start matching.
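The idea behind the bitmap can be illustrated with a small stand-alone sketch. This is not PCRE's internal code; the type start_bitmap and the functions below are hypothetical and exist only to show the technique. For a pattern such as (cat|dog), a match can only begin at a "c" or a "d", so every other position in the subject can be skipped with a single bit test.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One bit per possible byte value: bit c is set if byte c can start a match. */
typedef struct {
    unsigned char bits[32];
} start_bitmap;

static void bitmap_set(start_bitmap *bm, unsigned char c)
{
    bm->bits[c / 8] |= (unsigned char)(1u << (c % 8));
}

static int bitmap_test(const start_bitmap *bm, unsigned char c)
{
    return (bm->bits[c / 8] >> (c % 8)) & 1;
}

/* Build the map from a string that lists every possible first byte. */
void bitmap_init(start_bitmap *bm, const char *first_bytes)
{
    assert(bm != NULL && first_bytes != NULL);
    memset(bm, 0, sizeof *bm);
    for (; *first_bytes != '\0'; first_bytes++)
        bitmap_set(bm, (unsigned char)*first_bytes);
}

/* Return the first offset at or after 'start' whose byte is in the map,
   or -1 if no such position exists. */
long next_start_candidate(const start_bitmap *bm, const char *subject,
                          size_t length, size_t start)
{
    size_t i;

    for (i = start; i < length; i++)
        if (bitmap_test(bm, (unsigned char)subject[i]))
            return (long)i;
    return -1;
}
```

A matcher built on this sketch would attempt a full match only at the offsets that next_start_candidate() returns.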

The two optimizations just described can be disabled by setting the PCRE_NO_START_OPTIMIZE option when calling pcre_exec() or pcre_dfa_exec(). You might want to do this if your pattern contains callouts or (*MARK), and you want to make use of these facilities in cases where matching fails. See the discussion of PCRE_NO_START_OPTIMIZE below.