General

ISHML stands for Interactive Story grapH Management Library, but you can call it "Ishmael." It facilitates the creation of interactive fiction in JavaScript and is intended for client-side applications.

The ISHML library is a fluent API with straightforwardly named properties and methods, many of which are chainable.

The library exposes its methods and properties via a global variable, ishml.

The new operator is optional. ISHML constructors always return a new object instance when called.
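One common way to make the new operator optional in JavaScript is to check new.target inside the constructor. The sketch below uses a hypothetical Counter constructor to illustrate the pattern; it is not taken from ISHML's source, which may achieve the same effect differently.

```javascript
// Hypothetical constructor used only to illustrate the pattern; not part of ISHML.
function Counter(start = 0) {
  // new.target is undefined when the function is called without `new`.
  if (!new.target) {
    return new Counter(start);
  }
  this.count = start;
}

const a = new Counter(1); // explicit new
const b = Counter(2);     // new omitted, still a fresh instance
```

Both calls produce separate Counter instances, which is the behavior described above for ISHML constructors.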

The ISHML library is intended for modern browsers and relies on features of JavaScript found in the ES2016 specification.

See tutorials for examples of use.

.Interpretation()

An Interpretation represents one possible parsing of a string of characters in the context of a specific lexicon and grammar.

Constructor

Instances of ishml.Interpretation are not intended to be created by calling the object's constructor. Instead, they are generated through calls to ishml.Parser.analyze() or ishml.Rule.parse().

Properties

.gist Object

The gist contains the syntax tree resulting from parsing. Parsing breaks down text into a sequence of terms. Each of these terms may then be broken down into a sequence of one or more sub-terms, recursively. This process forms a syntax tree from the text. The structure of the gist follows the structure of the grammar rule used to create it. The nodes of the syntax tree are properties named after the sub-rules that define them. These properties may in turn have other properties defined on them all the way down the syntax tree to the terminal nodes of the tree. A terminal property contains the matching token for the sub-rule. In the case where a sub-rule defines the maximum number of tokens to match to be more than one, the value stored in the terminal property is an array of tokens.
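As a rough illustration, a grammar whose top-level rule has verb and nounPhrase sub-rules might produce a gist shaped like the object below for the input "take the shiny gold ring". The rule names, lexemes, and definitions are invented for this sketch, and the helper terminalLexemes is hypothetical; it simply walks the tree down to its terminal tokens.

```javascript
// Hand-written stand-in for a gist; real gists are produced by parsing.
const gist = {
  verb: { lexeme: "take", definitions: [{ part: "verb" }] },
  nounPhrase: {
    article: { lexeme: "the", definitions: [{ part: "article" }] },
    // A sub-rule whose maximum is greater than one stores an array of tokens.
    adjectives: [
      { lexeme: "shiny", definitions: [{ part: "adjective" }] },
      { lexeme: "gold", definitions: [{ part: "adjective" }] }
    ],
    noun: { lexeme: "ring", definitions: [{ part: "noun" }] }
  }
};

// In this sketch a node counts as a token when it has a string lexeme property.
const isToken = (node) =>
  node !== null && typeof node === "object" && typeof node.lexeme === "string";

// Collect the lexemes of all terminal tokens, depth first.
function terminalLexemes(node) {
  if (isToken(node)) return [node.lexeme];
  if (Array.isArray(node)) return node.flatMap((child) => terminalLexemes(child));
  if (node !== null && typeof node === "object") {
    return Object.values(node).flatMap((child) => terminalLexemes(child));
  }
  return [];
}

const words = terminalLexemes(gist);
```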

.remainder String

The remainder is the string of characters, if any, that were not matched during parsing. Successful parsing will result in a remainder with a string length of zero.

.valid Boolean

Indicates whether an interpretation was found to be valid. Defaults to true.
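A caller will usually want only the interpretations that are valid and that consumed all of the input. The sketch below filters a hand-written array of interpretation-like objects using the three properties described above; the sample data is invented and stands in for the result of a real parse.

```javascript
// Invented sample data shaped like ishml.Interpretation instances.
const interpretations = [
  { gist: { verb: { lexeme: "take" } }, remainder: "ring", valid: true },
  {
    gist: { verb: { lexeme: "take" }, noun: { lexeme: "ring" } },
    remainder: "",
    valid: true
  },
  { gist: { error: true }, remainder: "rnig", valid: false }
];

// Keep interpretations that are valid and left nothing unparsed.
const complete = interpretations.filter(
  (interpretation) => interpretation.valid && interpretation.remainder.length === 0
);
```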

See also parsing tutorial.

.Lexicon()

A Lexicon stores a collection of tokens.

Constructor

ishml.Lexicon()

Returns an instance of the Lexicon object. Use of the new operator is optional. Takes no arguments.

Used by ishml.Parser.analyze() and ishml.Rule.parse().

Methods

.register(lexeme...).as(definition)

Adds tokens to the lexicon.

The lexeme is the string of characters to be matched. More than one lexeme may be specified for the same definition. The same lexeme may have multiple definitions, but only one definition is permitted for each call of the register method.

The definition may be a simple value or a complex object. It is an arbitrary payload to be retrieved when the lexeme is matched. A definition typically holds one or more references to objects and functions defined elsewhere in the application.

Returns the instance of ishml.Lexicon. This method is chainable.
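To show how the two-step register(...).as(...) chain can fit together, here is a toy stand-in. ToyLexicon and its internal Map are assumptions made for illustration only; this is not ISHML's implementation, and its real storage is not described in this reference.

```javascript
// Toy stand-in for a lexicon; not ISHML's implementation.
class ToyLexicon {
  constructor() {
    this.entries = new Map(); // lexeme -> array of definitions
  }

  register(...lexemes) {
    // Return an object whose as() attaches one definition to every lexeme.
    return {
      as: (definition) => {
        for (const lexeme of lexemes) {
          if (!this.entries.has(lexeme)) this.entries.set(lexeme, []);
          this.entries.get(lexeme).push(definition);
        }
        return this; // hand back the lexicon so calls can be chained
      }
    };
  }
}

const lexicon = new ToyLexicon()
  .register("take", "get").as({ part: "verb", action: "take" })
  .register("ring").as({ part: "noun", item: "ring" })
  .register("ring").as({ part: "verb", action: "ring" });
```

Here "take" and "get" share one definition, while "ring" ends up with two definitions, one per call, matching the rule that each call of register carries a single definition.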

.search(lexeme[, options])

Searches for full and partial matches in the lexicon.

Returns an array of search results. Each search result is a plain JavaScript object with a token property containing the matching token and a remainder property containing the remaining unmatched string of characters from the lexeme argument.

The lexeme argument is a string of characters to be matched against the entries in the lexicon.

The options argument is a plain JavaScript object with the properties listed below, which override the default behavior of search.

.caseSensitive Boolean

Defaults to false. Set to true for case sensitive searches.

.full Boolean

Defaults to false for partial matching. Set to true for full matching.

A partial match is a match of the lexicon entry's full lexeme against the initial characters of the lexeme argument, but not the other way around.

A full match matches all the characters in the lexeme argument against the lexicon entry with no characters leftover.
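The difference between the two kinds of match can be sketched with a small stand-alone function. matchEntry is a hypothetical helper, not an ISHML method, and it deliberately ignores case sensitivity and separators to keep the comparison clear.

```javascript
// entry: a lexeme stored in the lexicon; text: the lexeme argument being searched.
// Returns { lexeme, remainder } on a match, or null otherwise.
function matchEntry(entry, text, full = false) {
  if (full) {
    // Full match: the whole text must equal the entry, leaving nothing over.
    return text === entry ? { lexeme: entry, remainder: "" } : null;
  }
  // Partial match: the whole entry must appear at the start of the text.
  return text.startsWith(entry)
    ? { lexeme: entry, remainder: text.slice(entry.length) }
    : null;
}

const partial = matchEntry("take", "take the ring");  // matches, with a remainder
const reversed = matchEntry("take the ring", "take"); // entry longer than text: no match
const strict = matchEntry("take", "take the ring", true); // full matching: no match
```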

.lax Boolean

Applies to partial matching. Defaults to false. Set to true to return partial matches even if the next character in the lexeme argument does not match the separator or end of string.

.longest Boolean

Defaults to false. Set to true to return the longest match. Only applicable when full is set to false for partial matching.

.regex RegExp

Defaults to false. May be set to any regular expression. Causes the text supplied in the lexeme argument to be matched against the regular expression without searching the lexicon for definitions. Instead, {fuzzy:true} is provided as the token's definition.

.separator RegExp

Applies to partial matching. Defaults to /^\s+/, whitespace. May be set to any regular expression. For no separator, set to empty string. When lax is set to false, a potential partial match will only be considered a match if the next character in the lexeme argument matches the separator or is the end of string.
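The interaction between lax and separator can be sketched as a predicate on the text that follows a candidate match. acceptsPartial is a hypothetical helper written for this illustration; it expects separator to be a RegExp and does not model the empty-string setting.

```javascript
// Decide whether a lexicon entry counts as a partial match at the start of text.
function acceptsPartial(entry, text, { separator = /^\s+/, lax = false } = {}) {
  if (!text.startsWith(entry)) return false;
  const rest = text.slice(entry.length);
  // Lax mode, or nothing left in the string: accept without checking the separator.
  if (lax || rest.length === 0) return true;
  // Otherwise the text after the entry must begin with the separator.
  return separator.test(rest);
}

const wordBoundary = acceptsPartial("ring", "ring the bell"); // followed by whitespace
const insideWord = acceptsPartial("ring", "ringing");         // followed by "ing"
const insideWordLax = acceptsPartial("ring", "ringing", { lax: true });
```

With the default settings "ring" is not accepted inside "ringing", which is usually what a word-based lexicon wants; setting lax to true lifts that restriction.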

.unregister(lexeme[,definition])

Removes a definition from a token in the lexicon.

The lexeme argument is a string of characters identifying a token in the lexicon.

The definition argument is a JavaScript object matching the original definition under which the lexeme was registered. The function does a shallow comparison of the properties and values of the definition argument to the definition stored in the lexicon. If they are found to be equal, the definition in the lexicon is deleted.

If no definition argument is provided, all definitions connected with the lexeme argument are removed from the lexicon, which, in essence, deletes the token.
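A shallow comparison checks only the top-level properties of two objects, so nested objects are equal only when they are the same reference. The sketch below shows one way such a comparison and removal could work; shallowEqual and removeDefinition are hypothetical helpers, and ISHML's exact comparison rules may differ.

```javascript
// True when both objects have the same own keys with strictly equal values.
function shallowEqual(a, b) {
  if (a === b) return true;
  if (typeof a !== "object" || typeof b !== "object" || a === null || b === null) {
    return false;
  }
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  return (
    keysA.length === keysB.length &&
    keysA.every(
      (key) => Object.prototype.hasOwnProperty.call(b, key) && a[key] === b[key]
    )
  );
}

// Return the definitions that remain after removal.
// With no target, every definition is removed.
function removeDefinition(definitions, target) {
  if (target === undefined) return [];
  return definitions.filter((definition) => !shallowEqual(definition, target));
}

const definitions = [
  { part: "noun", item: "ring" },
  { part: "verb", action: "ring" }
];
const withoutVerb = removeDefinition(definitions, { part: "verb", action: "ring" });
```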

Returns the instance of ishml.Lexicon. This method is chainable.

See also parsing tutorial.

.Parser()

A Parser object is a recursive descent parser, with backtracking, that works with a grammar and a lexicon to provide lexical, syntactic, and semantic analysis of input. It handles ambiguously defined tokens and grammars and may potentially generate multiple interpretations from the same text.

Recursive descent parsers are susceptible to exponential run times. Writing the grammar in such a way as to reduce backtracking can improve run times.

Constructor

ishml.Parser({lexicon:lexicon, grammar:rule})

Returns an instance of the Parser object. Use of the new operator is optional.

The lexicon argument is an instance of the ishml.Lexicon object that should be used for tokenizing input.

The rule argument is an instance of the ishml.Rule object that should be used for syntactic parsing and semantic analysis.

Properties

.lexicon

The ishml.Lexicon object that was specified in the constructor.

.grammar

The ishml.Rule object that was specified in the constructor.

Methods

.analyze(text)

Parses the text argument and returns an array of ishml.Interpretation objects.

The text argument is a string of characters to be analyzed.
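Putting the pieces together, a parser for two-word commands might be assembled as sketched below. The sketch uses only calls described in this reference and takes the library object as a parameter rather than reading the global directly. The vocabulary, the part field in each definition, and the function name buildCommandParser are assumptions made for illustration; see the parsing tutorial for authoritative examples.

```javascript
// Build a parser for commands of the form "<verb> <noun>".
// The ishml parameter is expected to be the library's global object.
function buildCommandParser(ishml) {
  const lexicon = ishml.Lexicon();
  lexicon
    .register("take", "get").as({ part: "verb", action: "take" })
    .register("ring").as({ part: "noun", item: "ring" });

  // A command is a verb followed by a noun (default mode: sequence).
  const command = ishml.Rule();
  command.snip("verb").snip("noun");

  // Terminal rules keep only definitions for the expected part of speech.
  command.verb.configure({ filter: (definition) => definition.part === "verb" });
  command.noun.configure({ filter: (definition) => definition.part === "noun" });

  return ishml.Parser({ lexicon: lexicon, grammar: command });
}

// In a page that has loaded ISHML, usage would look like:
// const interpretations = buildCommandParser(ishml).analyze("take ring");
```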

See also parsing tutorial.

.Rule()

A Rule object is a grammar rule that describes the syntax tree that will result from parsing some text with the rule. Rules are, in spirit, a JavaScript adaptation of EBNF notation.

Rules may be built from other rules and have an object structure that resembles the syntax tree that results when the rule's .parse() method is called.

The terminal sub-rules match the input text against a lexicon or a regular expression pattern.

Constructor

ishml.Rule()

Returns a new ishml.Rule object instance. Use of the new operator is optional. Takes no arguments.

Properties

Enumerable properties are of type ishml.Rule. They are created with the .snip() method, which forms a tree structure of rules, mirroring the intended syntax tree resulting from parsing.

The following non-enumerable properties set the rule's behavior when its .parse() method is called. These properties may be set directly or with the .configure() method.

.caseSensitive Boolean

Applies to terminal rules. Defaults to false. Set to true for case sensitive parsing.

.entire Boolean

If set to true requires the rule to match the entire input text with no remainder in order to be considered a valid match. Defaults to false for partial matching of the input text.

.filter function

Filters the array of definitions associated with the token(s) to be processed when the rule's .parse() method is called. Applies to terminal rules. Defaults to (definition)=>true. Returning true from the filter function indicates that the definition should be kept. Returning false removes the definition from the definitions array of the token in the resulting interpretation. A token that has no definitions left after filtering is considered a non-matching token for the rule.
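The effect of a filter on a single token can be sketched as follows. applyFilter is a hypothetical helper that mimics the behavior described above on a hand-written token; the part field in the definitions is an assumption.

```javascript
// Apply a definition filter to a token.
// Returns a filtered copy, or null when no definitions survive (a non-match).
function applyFilter(token, filter = (definition) => true) {
  const definitions = token.definitions.filter((definition) => filter(definition));
  return definitions.length > 0 ? { lexeme: token.lexeme, definitions } : null;
}

// "ring" is ambiguous: it can be a noun or a verb.
const ring = {
  lexeme: "ring",
  definitions: [{ part: "noun" }, { part: "verb" }]
};

const asNoun = applyFilter(ring, (definition) => definition.part === "noun");
const asAdjective = applyFilter(ring, (definition) => definition.part === "adjective");
```

Filtering for nouns keeps one definition, while filtering for adjectives leaves none, so the token would not match a rule that expects an adjective.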

.full Boolean

Applies to terminal rules. Defaults to false for partial matching. Set to true for full matching.

A partial match is a match of the lexicon entry's full lexeme against the initial characters of the text to be parsed, but not the other way around.

A full match matches all the characters of the text to be parsed against the lexicon entry with no characters leftover.

.greedy Boolean

Set to true to consider only the longest possible array of terms fitting the rule's criteria. Only applicable when minimum and maximum are set to different values and maximum is greater than one. Applies to both terminal and non-terminal rules. Defaults to false, which generates all possible interpretations between minimum and maximum, inclusive.

.keep Boolean

Includes the result of a rule's parsing in the final result of the parent rule. Applies to both terminal and non-terminal rules. Defaults to true. Set to false to require the rule to parse successfully, but skip its result.

.lax Boolean

Applies to partial matching in terminal rules. Defaults to false. Set to true to return partial matches even if the next character in the text to be parsed does not match the separator or end of string.

.longest Boolean

Applies to terminal rules. Defaults to false. Set to true to return only the longest match from the lexicon. Only applicable when full is set to false for partial matching.

.maximum Integer

Sets the maximum number of times to repeat the rule. Applies to both terminal and non-terminal rules. Defaults to 1. To allow an indefinite number of repetitions, set maximum to Infinity.

.minimum Integer

Sets the minimum number of times to repeat the rule.

Defaults to 1. Set minimum to 0 to make the rule optional.
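Together, minimum and maximum play the role of EBNF repetition: {minimum:0, maximum:1} is an optional item, {minimum:0, maximum:Infinity} is zero or more, and {minimum:1, maximum:Infinity} is one or more. The sketch below shows which repetition counts would be considered under these settings and how greedy narrows them. repetitionCounts is a hypothetical helper for illustration, not part of ISHML.

```javascript
// available: how many consecutive times the rule could match at this position.
// Returns the repetition counts that would each lead to an interpretation.
function repetitionCounts(available, { minimum = 1, maximum = 1, greedy = false } = {}) {
  const most = Math.min(available, maximum);
  if (most < minimum) return []; // too few matches: the rule fails
  if (greedy) return [most];     // greedy: only the longest run
  const counts = [];
  for (let n = minimum; n <= most; n++) counts.push(n);
  return counts;                 // non-greedy: every count from minimum up
}

// Three adjectives are available and the rule allows zero or more.
const allCounts = repetitionCounts(3, { minimum: 0, maximum: Infinity });
const greedyCounts = repetitionCounts(3, { minimum: 0, maximum: Infinity, greedy: true });
```

The non-greedy setting yields four alternatives (zero to three repetitions), while the greedy setting yields only the longest, which is one way to cut down on backtracking.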

.mismatch function

Modifies the rule's generated interpretations, according to the custom function assigned, in the event that the rule does not match the input text. Generally used to provide metadata about the reason for the rule's failure. Each interpretation is passed to the function and a modified interpretation is returned. Typically, the interpretation's .valid property is set to false and the interpretation's .gist property is modified to provide additional information. See example below.

//Example: attach an error message to the gist when a rule fails to match.
//Assumes nounPhrase and command are rules with noun and verb sub-rules.
nounPhrase.mismatch=(interpretation)=>
{
    interpretation.gist.error=true
    interpretation.gist.errorMessage=
        `Expected end of nounPhrase. Found: "${interpretation.remainder}".`
    interpretation.valid=false
    return interpretation
}
nounPhrase.noun.mismatch=(interpretation)=>
{
    interpretation.gist.error=true
    interpretation.gist.errorMessage=
        `Expected noun. Found: "${interpretation.remainder}".`
    interpretation.valid=false
    return interpretation
}
command.mismatch=(interpretation)=>
{
    interpretation.gist.error=true
    interpretation.gist.errorMessage=
        `Expected end of command. Found: "${interpretation.remainder}".`
    interpretation.valid=false
    return interpretation
}
command.verb.mismatch=(interpretation)=>
{
    interpretation.gist.error=true
    interpretation.gist.errorMessage=
        `Expected verb. Found: "${interpretation.remainder}".`
    interpretation.valid=false
    return interpretation
}

.mode all | any | apt

Sets the parsing mode for sub-rules of a rule. Applies to non-terminal rules. Defaults to ishml.enum.all, which treats the sub-rules as part of a sequence, each of which must parse successfully in order for the parent rule to be considered successfully parsed. The syntax trees generated by the sub-rules are appended to the node generated by the parent rule.

ishml.enum.any treats each sub-rule as a choice. At least one sub-rule must parse successfully in order for the rule to parse successfully. If more than one choice parses successfully, multiple alternative interpretations are generated. The resulting sub-tree generated by the sub-rule has its root node removed and becomes the syntax tree generated by the parent rule.

ishml.enum.apt treats each sub-rule as a choice. At least one choice must parse successfully in order for the rule to parse successfully. Parsing of sub-rules stops after the first successful choice is parsed and only one interpretation is generated. The resulting sub-tree generated by the sub-rule has its root node removed and becomes the syntax tree generated by the parent rule.
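The three modes can be compared with a toy model in which each sub-rule is a function from input text to an array of results. Everything in this sketch (word, all, any, apt, and the result shape) is hypothetical and far simpler than ISHML's parser; it only illustrates sequence versus choice versus first successful choice.

```javascript
// A toy terminal rule: matches a fixed word at the start of the text.
const word = (w) => (text) =>
  text.startsWith(w)
    ? [{ gist: w, remainder: text.slice(w.length).trimStart() }]
    : [];

// all: sub-rules form a sequence; each result is stored under its key.
function all(subRules) {
  return (text) =>
    Object.entries(subRules).reduce(
      (partials, [key, rule]) =>
        partials.flatMap((partial) =>
          rule(partial.remainder).map((result) => ({
            gist: { ...partial.gist, [key]: result.gist },
            remainder: result.remainder
          }))
        ),
      [{ gist: {}, remainder: text }]
    );
}

// any: every successful choice contributes its own alternative results.
const any = (choices) => (text) => choices.flatMap((rule) => rule(text));

// apt: return the results of the first successful choice and stop.
const apt = (choices) => (text) => {
  for (const rule of choices) {
    const results = rule(text);
    if (results.length > 0) return results;
  }
  return [];
};

const choices = [word("take"), word("take off")];
const anyResults = any(choices)("take off coat"); // both choices match
const aptResults = apt(choices)("take off coat"); // only "take" is tried
const command = all({ verb: any(choices), noun: word("coat") });
const commandResults = command("take off coat");
```

In this toy run, any keeps both readings of the verb so the sequence can still succeed with "take off", whereas apt would commit to "take" and leave "off coat" unparsed. That trade-off between completeness and reduced backtracking is the reason to choose between the two modes.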

.regex RegExp

Applies to terminal rules. Defaults to false for lexicon search. May be set to any regular expression. Causes the text to be parsed to be matched against the regular expression without searching the lexicon for definitions. Instead, {fuzzy:true} is provided as the token's definition.

.semantics function

Checks the rule's generated syntax tree for semantic correctness and optionally edits the syntax tree. Applies to non-terminal rules. Defaults to (interpretation)=>true, which accepts all interpretations as semantically correct. Returning false removes the interpretation from further consideration. Returning true allows the interpretation to continue processing. Optionally, you may alter the content of interpretation.gist and return the altered interpretation as an alternative to returning true.
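A semantics function might look like the sketch below, which rejects a command whose verb cannot act on its noun and otherwise annotates the gist. The function name commandSemantics and the fields needsPortable, portable, name, and summary are all assumptions invented for this example; only the return conventions come from the description above.

```javascript
// Hypothetical semantics function for a rule with verb and noun sub-rules.
function commandSemantics(interpretation) {
  const action = interpretation.gist.verb.definitions[0];
  const thing = interpretation.gist.noun.definitions[0];

  // Returning false drops the interpretation from further consideration.
  if (action.needsPortable && !thing.portable) return false;

  // Alternatively, edit the gist and return the altered interpretation.
  interpretation.gist.summary = `${action.name} ${thing.name}`;
  return interpretation;
}

// Hand-written interpretations standing in for parser output.
const takeRing = {
  gist: {
    verb: { lexeme: "take", definitions: [{ name: "take", needsPortable: true }] },
    noun: { lexeme: "ring", definitions: [{ name: "ring", portable: true }] }
  },
  remainder: "",
  valid: true
};
const takeHouse = {
  gist: {
    verb: { lexeme: "take", definitions: [{ name: "take", needsPortable: true }] },
    noun: { lexeme: "house", definitions: [{ name: "house", portable: false }] }
  },
  remainder: "",
  valid: true
};

const accepted = commandSemantics(takeRing);
const rejected = commandSemantics(takeHouse);
```

Such a function would be assigned to a rule's semantics property directly or through .configure({semantics: commandSemantics}).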

.separator RegExp

Applies to partial matching in terminal rules. Defaults to /^\s+/, whitespace. May be set to any regular expression. For no separator, set to empty string. When lax is set to false, a potential partial match will only be considered a match if the next character in the input text matches the separator or is the end of string.

Methods

.clone()

Creates a deep copy of the rule.

.configure(options)

Configures behavior of rule.

The options argument is a plain JavaScript object with properties that are the same as the non-enumerable properties of ishml.Rule.

Returns the rule. This method is chainable.

.parse(tokenization)

Parses a tokenization into one or more interpretations.

If the rule contains sub-rules, the parse method of each sub-rule is called recursively to build the syntax tree. If the rule has no sub-rules, the rule is a terminal rule and the next token(s) in the tokenization will be processed.

Returns an array of interpretations.

.snip(key [, rule])

Creates a new ishml.Rule instance as an enumerable property of the rule.

The key argument is the name to be used for the sub-rule and may be a string or integer. If the sub-rule is to be accessed using dot notation, the requirements for dot notation must be observed when naming the key. For convenience, spaces are automatically converted to underscores.

The rule argument is the ishml.Rule instance to be assigned to the new property. Cloning of rule is recommended, for example, command.snip("subject",nounPhrase.clone()), unless the rule is being defined recursively.

If rule is omitted, a new instance of ishml.Rule is used.

Returns the rule. This method is chainable.

See also parsing tutorial.

.Token()

A token object is the smallest unit of text along with its definition that is meaningful to an application.

Constructor

Instances of ishml.Token are not intended to be created by calling the object's constructor. Instead, they are produced (directly or indirectly) as a result of calls to ishml.Lexicon.search(), ishml.Lexicon.tokenize(), or ishml.Parser.analyze().

Properties

.lexeme String

The string of characters that identifies the token.

.definitions Array

An array of definitions retrieved from the lexicon that give meaning to the lexeme. A definition may be a simple value or a complex object. It is an arbitrary payload. A definition typically holds one or more references to objects and functions defined elsewhere in the application.

See also parsing tutorial.