English syntax (part two)

Context-free grammars (CFGs)

A context-free grammar is a simple model of syntax that lets us describe some of the syntactic structures of English. It is also frequently used to describe the syntax of programming languages, where it is usually written in Backus-Naur form (BNF).

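One way to understand this: context-free means that a sentence only has to be structurally well-formed according to the grammar, and the influence of the surrounding context can be ignored.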

eg. I run lake.

  • Informal description of the definition
  1. Terminals: English words.
  2. Nonterminals: categories of constituents.
  3. Rules: how certain constituents can be put together to form bigger constituents.
  4. Start symbol: S is a special nonterminal that represents a complete string, i.e. a sentence.
  • Example CFG
  1. S ⇒ NP VP : I want a morning flight.
  2. NP ⇒ Pronoun | ProperNoun | Det Nominal : I | Boston | a flight
  3. Nominal ⇒ Nominal Noun | Noun : morning flight | flights
  4. VP ⇒ Verb | Verb NP | VP PP : leave | want a flight | leave Boston at night, leaving on Thursday
  5. PP ⇒ Preposition NP : from Boston
  6. Pronoun ⇒ I | you | he
  7. ProperNoun ⇒ Boston | Paris
  8. Det ⇒ the | a | an
  • The arrow ⇒ can be read as “can be expanded as”, or “the left-hand side (LHS) can be rewritten as the right-hand side (RHS)”.

  • The vertical bar “|” separates alternatives.

  • The recursion in the grammar

  1. Part of a Nominal can itself be a Nominal, and this is a form of direct recursion.
  2. Recursion can also be indirect.
    eg. In an extended CFG, a VP could be part of an NP, while an NP could be part of a VP.
  • In the lexical rules (eg. rules 6 to 8), each RHS alternative consists of a single terminal (i.e. a word).
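
As a rough illustration, the example grammar can be written down as plain data. The sketch below is only one possible encoding, and the names GRAMMAR, is_nonterminal and directly_recursive are made up for this note. Each nonterminal maps to its alternatives, and any symbol without rules of its own is treated as a terminal. Note that Noun, Verb and Preposition have no rules in the list above, so under this encoding they would count as terminals until lexical rules are added for them.

```python
# Hypothetical encoding of the example CFG: each nonterminal maps to a list
# of alternatives, and each alternative is a list of symbols.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Pronoun"], ["ProperNoun"], ["Det", "Nominal"]],
    "Nominal": [["Nominal", "Noun"], ["Noun"]],
    "VP": [["Verb"], ["Verb", "NP"], ["VP", "PP"]],
    "PP": [["Preposition", "NP"]],
    "Pronoun": [["I"], ["you"], ["he"]],
    "ProperNoun": [["Boston"], ["Paris"]],
    "Det": [["the"], ["a"], ["an"]],
}


def is_nonterminal(symbol):
    """A symbol is a nonterminal if the grammar has rules for it."""
    return symbol in GRAMMAR


def directly_recursive(grammar):
    """Nonterminals that appear on the RHS of one of their own rules."""
    return sorted(
        lhs for lhs, alternatives in grammar.items()
        if any(lhs in alternative for alternative in alternatives)
    )


print(directly_recursive(GRAMMAR))
```

With the rules as listed, the directly recursive nonterminals are Nominal (rule 3) and VP (rule 4).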

Some sentence types in English

  • Declaratives: A plane left : S ⇒ NP VP
  • Imperatives : Leave! : S ⇒ VP
  • Yes/No questions : Did the plane leave? : S ⇒ Aux NP VP
  • Wh subject questions : Which flights serve breakfast? : S ⇒ Wh-NP VP
  • Wh non-subject questions : Which flight did you book? : S ⇒ Wh-NP Aux NP VP

Meaning (applications) of CFGs

  • Generating strings
  • Accepting/rejecting strings
  • Assigning structure to accepted strings
    The third is called parsing: taking a string (and a grammar) and computing the structure of the string according to the grammar. This structure is called the parse tree or parse.
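
Accepting and rejecting strings can be sketched with a small brute-force recognizer. This is not an efficient parsing algorithm, only an illustration under some assumptions: the lexical rules for Noun, Verb and Preposition are invented here (the grammar above does not list them), and the function names splits, derives and accepts are hypothetical. The recognizer tries every way of cutting the word sequence into non-empty parts, one part per RHS symbol, which terminates for this grammar because it has no empty rules and no cycles of single-symbol rules.

```python
from functools import lru_cache
from itertools import combinations

# The example grammar, plus ASSUMED lexical rules for Noun, Verb, Preposition.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Pronoun"], ["ProperNoun"], ["Det", "Nominal"]],
    "Nominal": [["Nominal", "Noun"], ["Noun"]],
    "VP": [["Verb"], ["Verb", "NP"], ["VP", "PP"]],
    "PP": [["Preposition", "NP"]],
    "Pronoun": [["I"], ["you"], ["he"]],
    "ProperNoun": [["Boston"], ["Paris"]],
    "Det": [["the"], ["a"], ["an"]],
    "Noun": [["flight"], ["morning"]],         # assumption
    "Verb": [["prefer"], ["want"], ["leave"]],  # assumption
    "Preposition": [["from"], ["on"], ["at"]],  # assumption
}


def splits(words, k):
    """Yield every way to cut `words` into k non-empty consecutive parts."""
    n = len(words)
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0, *cuts, n)
        yield [words[bounds[i]:bounds[i + 1]] for i in range(k)]


@lru_cache(maxsize=None)
def derives(symbol, words):
    """True if `symbol` can be rewritten into exactly the tuple `words`."""
    if symbol not in GRAMMAR:  # terminal: must match a single word
        return words == (symbol,)
    return any(
        all(derives(sym, part) for sym, part in zip(alternative, parts))
        for alternative in GRAMMAR[symbol]
        for parts in splits(words, len(alternative))
    )


def accepts(sentence):
    """Accept the sentence if the start symbol S derives it."""
    return derives("S", tuple(sentence.split()))


print(accepts("I prefer a morning flight"))
print(accepts("prefer I flight a"))
```

The first sentence is accepted and the second is rejected. A real parser would also return the structure it found rather than just a yes/no answer.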

Derivations

A derivation is a sequence of rule applications, starting from the start symbol (normally named S), that derives a sentence (a string of terminals).
eg.
S ⇒ NP VP ⇒ Pronoun VP ⇒ I VP ⇒ I Verb NP ⇒
I prefer NP ⇒
I prefer Det Nominal ⇒
I prefer a Nominal ⇒
I prefer a Nominal Noun ⇒ I prefer a Noun Noun ⇒
I prefer a morning Noun ⇒ I prefer a morning flight

Grammaticality

  • If a sentence has at least one derivation, it is said to be grammatical (with respect to the grammar).
  • The set of all sentences that can be derived by a given CFG is called a context-free language.
  • English and other natural languages are too intricate for any CFG to generate all and only their sentences. But as far as English (or any natural language) can be described by formal grammars, it seems to be roughly context-free.

Spoken language

  • Informal spoken language is difficult to capture because of speech disfluencies, i.e. phenomena such as repairs and the use of fillers (eg. uh).
    eg. He was wearing a black — uh, I mean a blue, a blue shirt.
  • We regard such issues as mostly separate from the syntax of written language.

Left-most derivations

  • The same sentence may have several derivations that differ only in the order in which nonterminals are rewritten. We can avoid this by demanding that the left-most nonterminal is always the one rewritten.
    eg. NP VP ⇒ Pronoun VP ⇒ Pronoun Verb NP
  • When a derivation is left-most, we write $\Rightarrow_{LM}$.
  • A left-most derivation of input $w$ using the rule sequence $d = \pi_1 \cdots \pi_m$, where $\pi_i = (A_i \to \alpha_i)$, is denoted $S \Rightarrow^{d}_{LM} w$.
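
The definition can be made concrete by replaying a rule sequence d step by step. The sketch below is an illustration only: NONTERMINALS and leftmost_derivation are hypothetical names, and the rule sequence is the one from the "I prefer a morning flight" derivation in the Derivations section. At each step the function looks up the left-most nonterminal in the current string and checks that the next rule rewrites exactly that symbol.

```python
# Nonterminals used in the example derivation (illustrative set).
NONTERMINALS = {"S", "NP", "VP", "Nominal", "Pronoun", "Verb", "Det", "Noun"}

# d = pi_1 ... pi_m, each pi_i = (A_i, alpha_i)
D = [
    ("S", ["NP", "VP"]),
    ("NP", ["Pronoun"]),
    ("Pronoun", ["I"]),
    ("VP", ["Verb", "NP"]),
    ("Verb", ["prefer"]),
    ("NP", ["Det", "Nominal"]),
    ("Det", ["a"]),
    ("Nominal", ["Nominal", "Noun"]),
    ("Nominal", ["Noun"]),
    ("Noun", ["morning"]),
    ("Noun", ["flight"]),
]


def leftmost_derivation(rules, start="S"):
    """Apply each rule to the left-most nonterminal; return all intermediate strings."""
    form = [start]
    steps = [form]
    for lhs, rhs in rules:
        # Position of the left-most nonterminal, or None if only words remain.
        idx = next((i for i, s in enumerate(form) if s in NONTERMINALS), None)
        if idx is None or form[idx] != lhs:
            raise ValueError(f"{lhs} is not the left-most nonterminal")
        form = form[:idx] + list(rhs) + form[idx + 1:]
        steps.append(form)
    return steps


for step in leftmost_derivation(D):
    print(" ".join(step))
```

A rule sequence that rewrites VP while NP is still the left-most nonterminal raises an error, because it is not a left-most derivation.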

Parse trees

Parse trees and left-most derivations are closely related concepts: each parse tree corresponds to exactly one left-most derivation, so the two terms are sometimes used synonymously.
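
One way to see the connection is to read the rules off a parse tree in pre-order (a node first, then its children from left to right), which yields the rule sequence of the left-most derivation. The sketch below assumes a made-up tree encoding, nested tuples of the form (label, child, ...) with words as plain strings, and a hypothetical helper leftmost_rules. The tree is the one for "I prefer a morning flight" from the Derivations section.

```python
# Parse tree as nested tuples: (label, child1, child2, ...); leaves are words.
TREE = (
    "S",
    ("NP", ("Pronoun", "I")),
    ("VP",
     ("Verb", "prefer"),
     ("NP",
      ("Det", "a"),
      ("Nominal",
       ("Nominal", ("Noun", "morning")),
       ("Noun", "flight")))),
)


def leftmost_rules(node):
    """Pre-order traversal: the rule used at this node, then its children's rules."""
    if isinstance(node, str):  # a word has no rule of its own
        return []
    label, *children = node
    rhs = [c if isinstance(c, str) else c[0] for c in children]
    rules = [(label, rhs)]
    for child in children:
        rules.extend(leftmost_rules(child))
    return rules


for lhs, rhs in leftmost_rules(TREE):
    print(lhs, "->", " ".join(rhs))
```

The printed rules appear in the same order as the steps of the derivation in the Derivations section.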

Bracketed notation

  • Bracketed notation is a linear way of writing a parse tree: each constituent is enclosed in square brackets labelled with its category.
    eg. [NP [Det a] [Nominal [Noun flight]]]
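
Converting a tree into bracketed notation is a short recursive function. As before, this is only a sketch: the nested-tuple tree encoding and the names bracketed and leaves are assumptions made for this note, and the tree is the parse of "I prefer a morning flight".

```python
# Parse tree as nested tuples: (label, child1, child2, ...); leaves are words.
TREE = (
    "S",
    ("NP", ("Pronoun", "I")),
    ("VP",
     ("Verb", "prefer"),
     ("NP",
      ("Det", "a"),
      ("Nominal",
       ("Nominal", ("Noun", "morning")),
       ("Noun", "flight")))),
)


def bracketed(node):
    """Write a constituent as [Label child child ...]; words are written as-is."""
    if isinstance(node, str):
        return node
    label, *children = node
    return "[" + label + " " + " ".join(bracketed(c) for c in children) + "]"


def leaves(node):
    """Read the words back off the tree, left to right."""
    if isinstance(node, str):
        return [node]
    _, *children = node
    return [word for child in children for word in leaves(child)]


print(bracketed(TREE))
print(" ".join(leaves(TREE)))
```

Dropping the brackets and labels gives back the original sentence, which is why the notation is a linear form of the tree and not a different analysis.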

References

The NLP slides of University of St Andrews

Reposted from blog.csdn.net/qq_34515959/article/details/104905143