Parser and Parsing
A parser is a program that enables the translation of source code, which is the original form of program instructions, into object code, a form that the computer is able to execute. As such, parsing is part of the compilation process. Parsing is the mechanics of dividing the source code into its component parts and identifying those components so that translation into object code can occur. Parsers are available for all standard programming and markup languages, such as Perl, Java and XML.
The parse function has an input series to which specific rules of analysis are applied. The input series may be a group of characters (string) or numerical data (block). If it is a string, it will be parsed by character. If it is a block, it will be parsed by value. The rules supply the parameters that determine how the information will be parsed. The parse function can be modified in two ways. The "/all" refinement forces a parsing of all the characters in a string, including those usually ignored by the parser: spaces, tabs, new lines, and other non-printable characters. The "/case" refinement specifies that the string be parsed in a case-sensitive manner. Normally, upper and lower case are not distinguished.
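The effect of the two refinements can be illustrated with a minimal sketch in Python. This is not the parse function described above: the name parse, its parameters all_chars and case, and the use of a list of literal strings as the rules are all assumptions made for illustration.

```python
def parse(text, rules, all_chars=False, case=False):
    """Return True if `text` consists of the literals in `rules`, in order.

    Hypothetical sketch: `all_chars` plays the role of the "/all" refinement
    and `case` the role of the "/case" refinement.
    """
    if not case:
        # Default behaviour: upper and lower case are not distinguished.
        text = text.lower()
        rules = [literal.lower() for literal in rules]

    pos = 0
    for literal in rules:
        if not all_chars:
            # Default behaviour: whitespace between components is ignored.
            while pos < len(text) and text[pos].isspace():
                pos += 1
        if not text.startswith(literal, pos):
            return False
        pos += len(literal)

    if not all_chars:
        # Ignore trailing whitespace as well.
        while pos < len(text) and text[pos].isspace():
            pos += 1
    return pos == len(text)
```

With the defaults, parse("Hello  World", ["hello", "world"]) succeeds because spacing and case are ignored. With all_chars=True the space must be matched by a rule of its own, and with case=True the capital letters must match exactly.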
Parsing is vital to many areas of computer science. The compiler, the program that actually accomplishes the translation of source code to object code, does so by examining the entire source code and then reorganizing the information. While the parser is the program that dissects the information, the compiler is the larger program that determines what is done with the parsed result. Without the compiler and the ability to parse the information, the reorganization of the information could not occur. Parsing is also important for the many applications that require the processing of commands.
Parsing comprises lexical analysis and semantic analysis. Lexical analysis divides strings, which are series of characters grouped together, into their components, which are called tokens. For example, a sentence could be divided into several tokens corresponding to a noun phrase, verb phrase, and prepositional phrase. Semantic parsing then tries to determine the meaning of the string. In the above example, semantic parsing would establish the noun, verb, preposition, and article constituents. A drawing of the process, with the string at the top and the completely parsed string components at the bottom, is reminiscent of an evergreen tree, with its broad base tapering upwards to a tip. The visual image inspired the name treebank, which refers to the arrangement of the parsing process.
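The division of a sentence into tokens and then into a tree of labelled constituents can be sketched as follows. The tiny lexicon, the four grammar rules, and all function names here are assumptions chosen for illustration; they handle only sentences of the shape "article noun verb preposition article noun".

```python
# Hypothetical word-class lexicon for the example sentence.
LEXICON = {"the": "Det", "dog": "N", "mat": "N", "sat": "V", "on": "P"}


def tokenize(sentence):
    """Split the string into lower-case word tokens."""
    return sentence.lower().split()


def word_class(tokens, i):
    """Return the word class of token i, or None if out of range or unknown."""
    return LEXICON.get(tokens[i]) if i < len(tokens) else None


def parse_np(tokens, i):
    # Noun phrase: NP -> Det N
    if word_class(tokens, i) == "Det" and word_class(tokens, i + 1) == "N":
        return ("NP", ("Det", tokens[i]), ("N", tokens[i + 1])), i + 2
    raise ValueError(f"expected a noun phrase at token {i}")


def parse_pp(tokens, i):
    # Prepositional phrase: PP -> P NP
    if word_class(tokens, i) == "P":
        np, j = parse_np(tokens, i + 1)
        return ("PP", ("P", tokens[i]), np), j
    raise ValueError(f"expected a prepositional phrase at token {i}")


def parse_vp(tokens, i):
    # Verb phrase: VP -> V PP
    if word_class(tokens, i) == "V":
        pp, j = parse_pp(tokens, i + 1)
        return ("VP", ("V", tokens[i]), pp), j
    raise ValueError(f"expected a verb phrase at token {i}")


def parse_sentence(sentence):
    # Sentence: S -> NP VP, consuming every token.
    tokens = tokenize(sentence)
    np, i = parse_np(tokens, 0)
    vp, i = parse_vp(tokens, i)
    if i != len(tokens):
        raise ValueError("unexpected tokens after the verb phrase")
    return ("S", np, vp)


tree = parse_sentence("The dog sat on the mat")
```

The result is a nested tuple whose root is the whole sentence and whose leaves are the individual words, which is the tree shape described above.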
The parsing process utilizes a form of grammar known as constraint grammar. In constraint grammar, the grammatical functions of words in a sentence and the relationships between these words are codified. Not all parsing systems are the same. There are two main differences: the number of constituent types, or tools used to dissect the grouped information, that a system uses, and the way in which the constituent types are allowed to combine with each other. Despite these differences, most parsing schemes are based on a form that is known as context-free phrase structure grammar. Within this form are two distinctly different forms of parsing: full parsing and skeleton parsing.
Full parsing aims to provide the highest level of detail possible of the structure of the grouped information. Skeleton parsing is less detailed. For example, in full parsing there are several types of noun phrases, which are distinguished by features such as their singular or plural nature. Several constituent labels are also used for adjective phrases, prepositional phrases, adverbial phrases, and verb phrases. Each label has a character code. Skeleton parsing, on the other hand, labels all noun phrases the same, with the letter N. Skeleton parsing uses fewer codes than full parsing, covering only broad categories (relative clause, noun phrase, prepositional phrase, compound sentence, verb phrase).
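The relationship between the two levels of detail can be sketched as a relabelling of a parse tree. The full-parsing codes N_sg and N_pl below are hypothetical stand-ins for singular and plural noun phrase labels; only the skeleton code N comes from the description above. Trees are assumed to be nested tuples of the form (label, child, ...), with plain strings as words.

```python
# Hypothetical full-parsing codes mapped to the single skeleton code "N".
FULL_TO_SKELETON = {"N_sg": "N", "N_pl": "N"}


def to_skeleton(node):
    """Collapse detailed constituent labels into their skeleton labels."""
    if isinstance(node, str):
        # A word (leaf) is left unchanged.
        return node
    label, *children = node
    # Labels without a mapping are assumed to be shared by both schemes.
    skeleton_label = FULL_TO_SKELETON.get(label, label)
    return (skeleton_label, *[to_skeleton(child) for child in children])


full_tree = ("S", ("N_sg", "the", "dog"), ("V", "chases", ("N_pl", "the", "cats")))
skeleton_tree = to_skeleton(full_tree)
```

In skeleton_tree both noun phrases carry the label N, so the singular and plural distinction recorded by the full parse is no longer visible.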
The term parsing is also applied to other computer-dependent tasks. The medium of digital video also involves parsing. Digital video needs to be properly processed before it can be inserted into a video server, and one processing need is parsing. Video parsing is the process of detecting scene changes, or the transition from one camera shot to another, in a video montage, which is often in a compressed state in a JPEG or MPEG file. Another aspect of parsing is found in biology, specifically in the emerging field of proteomics, the mapping of the identity and activities of all the proteins in a cell. Parsing comes in at the level of the genetic material. Deciphering the sequences of DNA that code for the thousands of functional proteins that operate in humans requires vast computational power.
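For the video case described above, one simple way to detect a transition between camera shots is to compare each frame with the one before it and flag a large jump. The sketch below assumes uncompressed frames given as equal-length, non-empty lists of pixel intensities; the function name detect_cuts and the threshold value are illustrative assumptions, and practical systems working on compressed video use more elaborate measures.

```python
def detect_cuts(frames, threshold=50.0):
    """Return the indices of frames that begin a new shot.

    A cut is reported at frame i when the mean absolute pixel difference
    between frame i - 1 and frame i exceeds `threshold`.
    """
    cuts = []
    for i in range(1, len(frames)):
        previous, current = frames[i - 1], frames[i]
        # Mean absolute difference between corresponding pixels.
        difference = sum(abs(a - b) for a, b in zip(previous, current)) / len(current)
        if difference > threshold:
            cuts.append(i)
    return cuts


# Two dark frames followed by two bright frames: one shot change.
frames = [
    [10, 10, 10, 10],
    [12, 11, 10, 9],
    [200, 210, 190, 205],
    [198, 211, 191, 204],
]
cuts = detect_cuts(frames)
```

Here cuts contains only the index 2, the first frame of the second shot, since the small differences within each shot stay below the threshold.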