Pcre fullinfo

Extracts information about a pattern
This function returns information about a compiled regular expression. It replaces the obsolete pcre_info function, which is nevertheless retained for backwards compatibility (and is documented below).

The first argument for pcre_fullinfo is a pointer to the compiled pattern. The second argument is the result of pcre_study, or NULL if the pattern was not studied. The third argument specifies which piece of information is required, and the fourth argument is a pointer to a variable that receives the data. The function returns zero on success, or one of the following negative error codes:

  PCRE_ERROR_NULL       the argument code was NULL
                        the argument where was NULL
  PCRE_ERROR_BADMAGIC   the "magic number" was not found
  PCRE_ERROR_BADOPTION  the value of what was invalid
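The return value could be turned into a diagnostic message with a small helper along the following lines. This is only a sketch: fullinfo_result_text and the SKETCH_* names are hypothetical, and the numeric values are assumed to match the PCRE_ERROR_* macros in pcre.h for PCRE 8.x. Code that includes pcre.h should use the macros themselves.

```c
/* Assumed to mirror the PCRE_ERROR_* values in pcre.h (PCRE 8.x).
   Real code should include <pcre.h> and use its macros instead. */
enum {
    SKETCH_ERROR_NULL      = -2,
    SKETCH_ERROR_BADOPTION = -3,
    SKETCH_ERROR_BADMAGIC  = -4
};

/* Hypothetical helper: describe the value returned by pcre_fullinfo. */
const char *fullinfo_result_text(int rc)
{
    switch (rc) {
    case 0:                      return "success";
    case SKETCH_ERROR_NULL:      return "code or where was NULL";
    case SKETCH_ERROR_BADMAGIC:  return "magic number not found";
    case SKETCH_ERROR_BADOPTION: return "invalid value of what";
    default:                     return "unexpected return value";
    }
}
```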

The "magic number" is placed at the start of each compiled pattern as a simple check against passing an arbitrary memory pointer. A typical call of pcre_fullinfo, to obtain the length of the compiled pattern, passes PCRE_INFO_SIZE as the third argument and the address of a size_t variable as the fourth. The possible values for the third argument are defined in pcre.h, and are as follows:

PCRE_INFO_BACKREFMAX

Return the number of the highest back reference in the pattern. The fourth argument should point to an int variable. Zero is returned if there are no back references.

PCRE_INFO_CAPTURECOUNT

Return the number of capturing subpatterns in the pattern. The fourth argument should point to an int variable.

PCRE_INFO_DEFAULT_TABLES

Return a pointer to the internal default character tables within PCRE. The fourth argument should point to an unsigned char * variable. This information call is provided for internal use by the pcre_study function. External callers can cause PCRE to use its internal tables by passing a NULL table pointer.

PCRE_INFO_FIRSTBYTE

Return information about the first byte of any matched string, for a non-anchored pattern. The fourth argument should point to an int variable. (This option used to be called PCRE_INFO_FIRSTCHAR; the old name is still recognized for backwards compatibility.)

If there is a fixed first byte, for example, from a pattern such as (cat|cow|coyote), its value is returned. Otherwise, if either

(a) the pattern was compiled with the PCRE_MULTILINE option, and every branch starts with "^", or

(b) every branch of the pattern starts with ".*" and PCRE_DOTALL is not set (if it were set, the pattern would be anchored),

-1 is returned, indicating that the pattern matches only at the start of a subject string or after any newline within the string. Otherwise -2 is returned. For anchored patterns, -2 is returned.
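The three cases described above can be told apart with a simple test on the returned value. The following is a minimal sketch; the type first_byte_kind and the function classify_first_byte are hypothetical names, not part of PCRE.

```c
/* Hypothetical classification of the value stored by PCRE_INFO_FIRSTBYTE. */
typedef enum {
    FIRST_FIXED_BYTE,     /* value >= 0: a fixed first byte was found      */
    FIRST_AT_LINE_START,  /* -1: matches only at start or after a newline  */
    FIRST_UNKNOWN         /* -2: nothing known, or the pattern is anchored */
} first_byte_kind;

first_byte_kind classify_first_byte(int value)
{
    if (value >= 0)
        return FIRST_FIXED_BYTE;
    if (value == -1)
        return FIRST_AT_LINE_START;
    return FIRST_UNKNOWN;
}
```

For a pattern such as (cat|cow|coyote), the value would be the byte "c", which this sketch classifies as FIRST_FIXED_BYTE.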

PCRE_INFO_FIRSTTABLE

If the pattern was studied, and this resulted in the construction of a 256-bit table indicating a fixed set of bytes for the first byte in any matching string, a pointer to the table is returned. Otherwise NULL is returned. The fourth argument should point to an unsigned char * variable.
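A 256-bit table occupies 32 bytes, one bit per possible byte value. The sketch below shows how such a table might be consulted. The bit layout (bit c & 7 of byte c / 8 stands for byte value c) is an assumption that is not stated in the text above, and may_start_with is a hypothetical name.

```c
#include <stddef.h>

/* Returns nonzero if byte c is marked as a possible first byte.
   ASSUMPTION: bit (c & 7) of byte (c / 8) represents byte value c.
   A NULL table means no table was built, so nothing can be ruled out. */
int may_start_with(const unsigned char *first_table, unsigned char c)
{
    if (first_table == NULL)
        return 1;
    return (first_table[c / 8] & (1u << (c & 7))) != 0;
}
```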

PCRE_INFO_HASCRORLF

Return 1 if the pattern contains any explicit matches for CR or LF characters, otherwise 0. The fourth argument should point to an int variable. An explicit match is either a literal CR or LF character, or \r or \n.

PCRE_INFO_JCHANGED

Return 1 if the (?J) or (?-J) option setting is used in the pattern, otherwise 0. The fourth argument should point to an int variable. (?J) and (?-J) set and unset the local PCRE_DUPNAMES option, respectively.

PCRE_INFO_LASTLITERAL

Return the value of the rightmost literal byte that must exist in any matched string, other than at its start, if such a byte has been recorded. The fourth argument should point to an int variable. If there is no such byte, -1 is returned. For anchored patterns, a last literal byte is recorded only if it follows something of variable length. For example, for the pattern /^a\d+z\d+/ the returned value is "z", but for /^a\dz\d/ the returned value is -1.

PCRE_INFO_MINLENGTH

If the pattern was studied and a minimum length for matching subject strings was computed, its value is returned. Otherwise the returned value is -1. The value is a number of characters, not bytes (this may be relevant in UTF-8 mode). The fourth argument should point to an int variable. A non-negative value is a lower bound to the length of any matching string. There may not be any strings of that length that do actually match, but every string that does match is at least that long.
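Because the value is a lower bound counted in characters, it can serve as a cheap pre-check before attempting a match. The sketch below is one way to do that, assuming the subject is valid UTF-8 when the utf8 flag is set. The function names utf8_char_count and long_enough are hypothetical.

```c
#include <stddef.h>

/* Count characters in a UTF-8 buffer by skipping continuation bytes
   (those of the form 10xxxxxx). Assumes the input is valid UTF-8. */
size_t utf8_char_count(const char *s, size_t nbytes)
{
    size_t count = 0;
    for (size_t i = 0; i < nbytes; i++) {
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            count++;
    }
    return count;
}

/* Returns 0 when the subject is certainly too short to match,
   nonzero when a match cannot be ruled out by length alone.
   minlength is the value stored by PCRE_INFO_MINLENGTH. */
int long_enough(int minlength, const char *subject, size_t nbytes, int utf8)
{
    if (minlength < 0)
        return 1;                      /* no lower bound is known */
    size_t chars = utf8 ? utf8_char_count(subject, nbytes) : nbytes;
    return chars >= (size_t)minlength;
}
```

A nonzero result does not mean the subject matches; it only means the length check did not exclude it.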

PCRE_INFO_NAMECOUNT
PCRE_INFO_NAMEENTRYSIZE
PCRE_INFO_NAMETABLE

PCRE supports the use of named as well as numbered capturing parentheses. The names are just an additional way of identifying the parentheses, which still acquire numbers. Several convenience functions such as pcre_get_named_substring are provided for extracting captured substrings by name. It is also possible to extract the data directly, by first converting the name to a number in order to access the correct pointers in the output vector (described with pcre_exec below). To do the conversion, you need to use the name-to-number map, which is described by these three values.

The map consists of a number of fixed-size entries. PCRE_INFO_NAMECOUNT gives the number of entries, and PCRE_INFO_NAMEENTRYSIZE gives the size of each entry; both of these return an int value. The entry size depends on the length of the longest name. PCRE_INFO_NAMETABLE returns a pointer to the first entry of the table (a pointer to char). The first two bytes of each entry are the number of the capturing parenthesis, most significant byte first. The rest of the entry is the corresponding name, zero terminated.

The names are in alphabetical order. Duplicate names may appear if (?| is used to create multiple groups with the same number, as described in the section on duplicate subpattern numbers in the pcrepattern page. Duplicate names for subpatterns with different numbers are permitted only if PCRE_DUPNAMES is set. In all cases of duplicate names, they appear in the table in the order in which they were found in the pattern. In the absence of (?| this is the order of increasing number; when (?| is used this is not necessarily the case because later subpatterns may have lower numbers.

As a simple example of the name/number table, consider the following pattern (assume PCRE_EXTENDED is set, so white space - including newlines - is ignored):

  (?<date> (?<year>(\d\d)?\d\d) - (?<month>\d\d) - (?<day>\d\d) )

There are four named subpatterns, so the table has four entries, and each entry in the table is eight bytes long. The table is as follows, with non-printing bytes shown in hexadecimal, and undefined bytes shown as ??:

  00 01 d  a  t  e  00 ??
  00 05 d  a  y  00 ?? ??
  00 04 m  o  n  t  h  00
  00 02 y  e  a  r  00 ??

When writing code to extract data from named subpatterns using the name-to-number map, remember that the length of the entries is likely to be different for each compiled pattern.
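The table layout described above (a two-byte group number, most significant byte first, followed by a zero-terminated name, in entries of a fixed size) can be walked as in the sketch below. The function name name_to_number is hypothetical; the count, entry size, and table pointer are assumed to come from PCRE_INFO_NAMECOUNT, PCRE_INFO_NAMEENTRYSIZE, and PCRE_INFO_NAMETABLE respectively.

```c
#include <string.h>

/* Look up a group name in a name table of `count` entries, each
   `entry_size` bytes long. Every entry holds a two-byte group number
   (most significant byte first) followed by a NUL-terminated name.
   Returns the number of the first entry whose name matches, or -1
   if the name is not in the table. */
int name_to_number(const char *table, int count, int entry_size,
                   const char *name)
{
    for (int i = 0; i < count; i++) {
        const unsigned char *entry =
            (const unsigned char *)table + (size_t)i * (size_t)entry_size;
        if (strcmp((const char *)entry + 2, name) == 0)
            return (entry[0] << 8) | entry[1];
    }
    return -1;
}
```

A linear scan is used here for simplicity, and the entry size is always taken from the caller rather than hard-coded. With duplicate names, this sketch reports only the first entry found.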

PCRE_INFO_OKPARTIAL

Return 1 if the pattern can be used for partial matching with pcre_exec, otherwise 0. The fourth argument should point to an int variable. From release 8.00, this always returns 1, because the restrictions that previously applied to partial matching have been lifted. The pcrepartial documentation gives details of partial matching.

PCRE_INFO_OPTIONS

Return a copy of the options with which the pattern was compiled. The fourth argument should point to an unsigned long int variable. These option bits are those specified in the call to pcre_compile, modified by any top-level option settings at the start of the pattern itself. In other words, they are the options that will be in force when matching starts. For example, if the pattern /(?im)abc(?-i)d/ is compiled with the PCRE_EXTENDED option, the result is PCRE_CASELESS, PCRE_MULTILINE, and PCRE_EXTENDED.

A pattern is automatically anchored by PCRE if all of its top-level alternatives begin with one of the following:

  ^     unless PCRE_MULTILINE is set
  \A    always
  \G    always
  .*    if PCRE_DOTALL is set and there are no back references to the subpattern in which .* appears

For such patterns, the PCRE_ANCHORED bit is set in the options returned by pcre_fullinfo.
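The returned option word can be inspected bit by bit. The sketch below decodes a few of the options mentioned in this section. The SKETCH_* constants are assumed to mirror the corresponding PCRE_* bit values in pcre.h for PCRE 8.x, and describe_options and is_anchored are hypothetical helpers. Code that includes pcre.h should test against the PCRE_* macros directly.

```c
#include <stdio.h>
#include <string.h>

/* Assumed to mirror the option bits in pcre.h (PCRE 8.x). */
#define SKETCH_CASELESS   0x00000001UL
#define SKETCH_MULTILINE  0x00000002UL
#define SKETCH_DOTALL     0x00000004UL
#define SKETCH_EXTENDED   0x00000008UL
#define SKETCH_ANCHORED   0x00000010UL

static const struct {
    unsigned long bit;
    const char *name;
} option_names[] = {
    { SKETCH_CASELESS,  "PCRE_CASELESS"  },
    { SKETCH_MULTILINE, "PCRE_MULTILINE" },
    { SKETCH_DOTALL,    "PCRE_DOTALL"    },
    { SKETCH_EXTENDED,  "PCRE_EXTENDED"  },
    { SKETCH_ANCHORED,  "PCRE_ANCHORED"  }
};

/* Write a space-separated list of the known option names that are set
   in `options` into buf. Output is truncated if buf is too small. */
void describe_options(unsigned long options, char *buf, size_t size)
{
    if (size == 0)
        return;
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof option_names / sizeof option_names[0]; i++) {
        if ((options & option_names[i].bit) == 0)
            continue;
        size_t used = strlen(buf);
        snprintf(buf + used, size - used, "%s%s",
                 used ? " " : "", option_names[i].name);
    }
}

/* Nonzero if the anchored bit is set in the returned options. */
int is_anchored(unsigned long options)
{
    return (options & SKETCH_ANCHORED) != 0;
}
```

Under these assumed bit values, the option word for /(?im)abc(?-i)d/ compiled with PCRE_EXTENDED would decode to the three names given in the text, and is_anchored would report 0 for it.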

PCRE_INFO_SIZE

Return the size of the compiled pattern, that is, the value that was passed as the argument to pcre_malloc when PCRE was getting memory in which to place the compiled data. The fourth argument should point to a size_t variable.

PCRE_INFO_STUDYSIZE

Return the size of the data block pointed to by the study_data field in a pcre_extra block. That is, it is the value that was passed to pcre_malloc when PCRE was getting memory into which to place the data created by pcre_study. If pcre_extra is NULL, or there is no study data, zero is returned. The fourth argument should point to a size_t variable.