bytesmith-0.3.11.1: Nonresumable byte parser
Safe Haskell: None
Language: Haskell2010

Data.Bytes.Parser.Latin

Description

Parse input as though it were text encoded by ISO 8859-1 (Latin-1). All byte sequences are valid text under ISO 8859-1.

Matching

Required

char :: e -> Char -> Parser e s () #

Consume the next character, failing if it does not match the expected value or if there is no more input.

char2 :: e -> Char -> Char -> Parser e s () #

Consume the next two characters, failing if they do not match the expected values.

char2 e a b === char e a *> char e b

char3 :: e -> Char -> Char -> Char -> Parser e s () #

Consume three characters, failing if they do not match the expected values.

char3 e a b c === char e a *> char e b *> char e c

char4 :: e -> Char -> Char -> Char -> Char -> Parser e s () #

Consume four characters, failing if they do not match the expected values.

char4 e a b c d === char e a *> char e b *> char e c *> char e d

char5 :: e -> Char -> Char -> Char -> Char -> Char -> Parser e s () #

Consume five characters, failing if they do not match the expected values.

char6 :: e -> Char -> Char -> Char -> Char -> Char -> Char -> Parser e s () #

Consume six characters, failing if they do not match the expected values.

char7 :: e -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Parser e s () #

Consume seven characters, failing if they do not match the expected values.

char8 :: e -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Parser e s () #

Consume eight characters, failing if they do not match the expected values.

char9 :: e -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Parser e s () #

Consume nine characters, failing if they do not match the expected values.

char10 :: e -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Parser e s () #

Consume ten characters, failing if they do not match the expected values.

char11 :: e -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Parser e s () #

Consume eleven characters, failing if they do not match the expected values.

char12 :: e -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Parser e s () #

Consume twelve characters, failing if they do not match the expected values.
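
The fixed-arity matchers are handy for recognising short keywords. The following is a minimal sketch rather than an excerpt from the library's own examples: `startsWithNull` is a hypothetical helper, the error type `()` is chosen only for brevity, and `Latin1.fromString` is assumed to be available from the byteslice package's `Data.Bytes.Text.Latin1` module.

```haskell
import Data.Maybe (isJust)
import qualified Data.Bytes.Parser as Parser
import qualified Data.Bytes.Parser.Latin as Latin
import qualified Data.Bytes.Text.Latin1 as Latin1

-- Hypothetical helper: does the input begin with the keyword "null"?
-- parseBytesMaybe does not require the whole input to be consumed.
startsWithNull :: String -> Bool
startsWithNull s =
  isJust (Parser.parseBytesMaybe (Latin.char4 () 'n' 'u' 'l' 'l') (Latin1.fromString s))

main :: IO ()
main = do
  print (startsWithNull "null") -- True
  print (startsWithNull "nul")  -- False: input ends before the fourth character
  print (startsWithNull "nile") -- False: the second character does not match
```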

Try

trySatisfy :: (Char -> Bool) -> Parser e s Bool #

Runs the predicate on the next character in the input. If the predicate is matched, this consumes the character. Otherwise, the character is not consumed. This returns False if the end of the input has been reached. This never fails.
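
Because trySatisfy reports whether it consumed a character, it works well for optional prefixes. The sketch below is illustrative only: `run` and `signAndMagnitude` are hypothetical names, the error type is `()`, and `Latin1.fromString` is assumed to come from byteslice's `Data.Bytes.Text.Latin1`.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Bytes.Parser (Parser)
import qualified Data.Bytes.Parser as Parser
import qualified Data.Bytes.Parser.Latin as Latin
import qualified Data.Bytes.Text.Latin1 as Latin1

-- Hypothetical helper: run a parser over a Latin-1 encoded String.
run :: (forall s. Parser () s a) -> String -> Maybe a
run p s = Parser.parseBytesMaybe p (Latin1.fromString s)

-- Optional leading minus, then a decimal magnitude.
signAndMagnitude :: Parser () s (Bool, Word)
signAndMagnitude = do
  negative <- Latin.trySatisfy (== '-') -- consumes '-' only if it is present
  magnitude <- Latin.decWord ()
  pure (negative, magnitude)

main :: IO ()
main = do
  print (run signAndMagnitude "-12") -- Just (True,12)
  print (run signAndMagnitude "12")  -- Just (False,12)
  print (run signAndMagnitude "-")   -- Nothing: decWord needs at least one digit
```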

trySatisfyThen #

Arguments

:: Parser e s a                      Default parser. Runs on Nothing or end of input.
-> (Char -> Maybe (Parser e s a))    Parser-selecting predicate.
-> Parser e s a

Runs the function on the next character in the input. If the function returns Just, this consumes the character and then runs the parser on the remaining input. If the function returns Nothing, this does not consume the tested character, and it runs the default parser on the input (which includes the tested character). If there is no input remaining, this also runs the default parser. This combinator never fails.
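
As a rough illustration of the dispatch described above, the sketch below parses a number with an optional unit suffix. The names `run` and `count` are hypothetical, the error type is `()`, and `Latin1.fromString` is assumed to come from byteslice's `Data.Bytes.Text.Latin1`.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Bytes.Parser (Parser)
import qualified Data.Bytes.Parser as Parser
import qualified Data.Bytes.Parser.Latin as Latin
import qualified Data.Bytes.Text.Latin1 as Latin1

-- Hypothetical helper: run a parser over a Latin-1 encoded String.
run :: (forall s. Parser () s a) -> String -> Maybe a
run p s = Parser.parseBytesMaybe p (Latin1.fromString s)

-- A decimal number followed by an optional 'k' or 'M' suffix.
count :: Parser () s Word
count = do
  n <- Latin.decWord ()
  Latin.trySatisfyThen (pure n) $ \c -> case c of
    'k' -> Just (pure (n * 1000))    -- suffix consumed, scaled result
    'M' -> Just (pure (n * 1000000))
    _   -> Nothing                   -- character left in place, default runs

main :: IO ()
main = do
  print (run count "12k") -- Just 12000
  print (run count "12")  -- Just 12 (end of input, default parser)
  print (run count "12x") -- Just 12 ('x' is not consumed)
```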

One Character

any :: e -> Parser e s Char #

Consumes and returns the next character in the input.

opt :: Parser e s (Maybe Char) #

Consume a character from the input or return Nothing if the end of the stream has been reached. Since ISO 8859-1 maps every byte to a character, this parser never fails.

opt# :: Parser e s (# (# #) | Char# #) #

Variant of opt with unboxed result.

Many Characters

takeTrailedBy :: e -> Char -> Parser e s Bytes #

Take characters until the specified character is encountered. Consumes the matched character as well. Fails if the character is not present. Visually, the cursor advancement and resulting Bytes for takeTrailedBy 'D' look like this:

 A B C D E F | input
|->->->-|    | cursor
{-*-*-}      | result bytes
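
The diagram can be checked with a short sketch that takes the field and then reads one more character to show where the cursor ends up. `run` and `fieldThenNext` are hypothetical names, the error type is `()`, and `Latin1.fromString` is assumed to come from byteslice's `Data.Bytes.Text.Latin1`.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Bytes (Bytes)
import Data.Bytes.Parser (Parser)
import qualified Data.Bytes.Parser as Parser
import qualified Data.Bytes.Parser.Latin as Latin
import qualified Data.Bytes.Text.Latin1 as Latin1

-- Hypothetical helper: run a parser over a Latin-1 encoded String.
run :: (forall s. Parser () s a) -> String -> Maybe a
run p s = Parser.parseBytesMaybe p (Latin1.fromString s)

-- Take everything before 'D', then read the character after 'D'.
fieldThenNext :: Parser () s (Bytes, Char)
fieldThenNext = (,) <$> Latin.takeTrailedBy () 'D' <*> Latin.any ()

main :: IO ()
main = do
  -- The result excludes the trailer, and the cursor sits just past it.
  print (run fieldThenNext "ABCDEF" == Just (Latin1.fromString "ABC", 'E')) -- True
  -- No 'D' in the input, so takeTrailedBy fails.
  print (run fieldThenNext "ABC" == Nothing) -- True
```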

Lookahead

peek :: Parser e s (Maybe Char) #

Match any character, to perform lookahead. Returns Nothing if end of input has been reached. Does not consume any input.

Note: Because this parser does not fail, do not use it with combinators such as many, because such combinators loop until a failure occurs. Careless use will thus result in an infinite loop.

peek' :: e -> Parser e s Char #

Match any character, to perform lookahead. Does not consume any input, but fails if the end of input has been reached.
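
Lookahead is typically used to choose between parsers without committing to either. The following sketch is one possible use, not a prescribed pattern: `run` and `number` are hypothetical names, the error type is `()`, and `Latin1.fromString` is assumed to come from byteslice's `Data.Bytes.Text.Latin1`.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Bytes.Parser (Parser)
import qualified Data.Bytes.Parser as Parser
import qualified Data.Bytes.Parser.Latin as Latin
import qualified Data.Bytes.Text.Latin1 as Latin1

-- Hypothetical helper: run a parser over a Latin-1 encoded String.
run :: (forall s. Parser () s a) -> String -> Maybe a
run p s = Parser.parseBytesMaybe p (Latin1.fromString s)

-- Peek at the next character, then let the chosen parser consume it.
number :: Parser () s (Either Int Word)
number = do
  next <- Latin.peek' ()
  if next == '-'
    then Left <$> Latin.decStandardInt ()
    else Right <$> Latin.decWord ()

main :: IO ()
main = do
  print (run number "-5") -- Just (Left (-5))
  print (run number "5")  -- Just (Right 5)
  print (run number "")   -- Nothing: peek' fails at end of input
```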

Skip

skipDigits :: Parser e s () #

Skip the characters 0-9 until a non-digit is encountered. This parser does not fail.

skipDigits1 :: e -> Parser e s () #

Variant of skipDigits that requires at least one digit to be present.

skipChar :: Char -> Parser e s () #

Skip the character any number of times. This succeeds even if the character was not present.

skipChar1 :: e -> Char -> Parser e s () #

Skip the character any number of times. It must occur at least once or else this will fail.

skipTrailedBy :: e -> Char -> Parser e s () #

Skip all characters until the terminator is encountered and then consume the matching character as well. Visually, skipTrailedBy 'C' advances the cursor like this:

 A Z B Y C X C W
|->->->->-|

This fails if it reaches the end of input without encountering the character.

skipUntil :: Char -> Parser e s () #

Skip all characters until the terminator is encountered. This does not consume the terminator. Visually, skipUntil 'C' advances the cursor like this:

 A Z B Y C X C W
|->->->-|

This succeeds if it reaches the end of the input without encountering the terminator. It never fails.

skipWhile :: (Char -> Bool) -> Parser e s () #

Skip while the predicate is matched. This is always inlined.
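
The difference between skipTrailedBy and skipUntil is easiest to see by reading the character that follows each skip. This is a small sketch using the same input as the diagrams; `run`, `afterTrailedBy`, and `afterUntil` are hypothetical names, the error type is `()`, and `Latin1.fromString` is assumed to come from byteslice's `Data.Bytes.Text.Latin1`.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Bytes.Parser (Parser)
import qualified Data.Bytes.Parser as Parser
import qualified Data.Bytes.Parser.Latin as Latin
import qualified Data.Bytes.Text.Latin1 as Latin1

-- Hypothetical helper: run a parser over a Latin-1 encoded String.
run :: (forall s. Parser () s a) -> String -> Maybe a
run p s = Parser.parseBytesMaybe p (Latin1.fromString s)

-- Skip through the first 'C', then read the next character.
afterTrailedBy :: Parser () s Char
afterTrailedBy = Latin.skipTrailedBy () 'C' *> Latin.any ()

-- Skip up to (but not including) the first 'C', then read the next character.
afterUntil :: Parser () s Char
afterUntil = Latin.skipUntil 'C' *> Latin.any ()

main :: IO ()
main = do
  print (run afterTrailedBy "AZBYCXCW") -- Just 'X'
  print (run afterUntil "AZBYCXCW")     -- Just 'C'
  print (run afterTrailedBy "AZB")      -- Nothing: terminator never found
```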

End of Input

endOfInput :: e -> Parser e s () #

Fails if there is still more input remaining.

isEndOfInput :: Parser e s Bool #

Returns True if there are no more bytes in the input and False otherwise. Always succeeds.
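
A common use of endOfInput is to reject trailing garbage. The sketch below assumes that `parseBytesMaybe` from Data.Bytes.Parser does not itself check for leftover input; `run` is a hypothetical helper, the error type is `()`, and `Latin1.fromString` is assumed to come from byteslice's `Data.Bytes.Text.Latin1`.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Bytes.Parser (Parser)
import qualified Data.Bytes.Parser as Parser
import qualified Data.Bytes.Parser.Latin as Latin
import qualified Data.Bytes.Text.Latin1 as Latin1

-- Hypothetical helper: run a parser over a Latin-1 encoded String.
run :: (forall s. Parser () s a) -> String -> Maybe a
run p s = Parser.parseBytesMaybe p (Latin1.fromString s)

main :: IO ()
main = do
  -- Without endOfInput, the trailing 'x' is silently left unconsumed.
  print (run (Latin.decWord ()) "42x")                        -- Just 42
  -- With endOfInput, leftover input makes the whole parse fail.
  print (run (Latin.decWord () <* Latin.endOfInput ()) "42x") -- Nothing
  print (run (Latin.decWord () <* Latin.endOfInput ()) "42")  -- Just 42
```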

Numbers

Decimal

Unsigned

decWord :: e -> Parser e s Word #

Parse a decimal-encoded number. If the number is too large to be represented by a machine word, this fails with the provided error message. This accepts any number of leading zeroes.

decWord8 :: e -> Parser e s Word8 #

Parse a decimal-encoded 8-bit word. If the number is larger than 255, this parser fails.

decWord16 :: e -> Parser e s Word16 #

Parse a decimal-encoded 16-bit word. If the number is larger than 65535, this parser fails.

decWord32 :: e -> Parser e s Word32 #

Parse a decimal-encoded 32-bit word. If the number is larger than 4294967295, this parser fails.

decWord64 :: e -> Parser e s Word64 #

Parse a decimal-encoded unsigned number. If the number is too large to be represented by a 64-bit word, this fails with the provided error message. This accepts any number of leading zeroes.

Signed

decUnsignedInt :: e -> Parser e s Int #

Parse a decimal-encoded number. If the number is too large to be represented by a machine integer, this fails with the provided error message. This rejects input that is preceded by a plus or minus sign. Consequently, it does not parse negative numbers. Use decStandardInt or decSignedInt for that purpose. On a 64-bit platform decWord will successfully parse 9223372036854775808 (i.e. 2 ^ 63), but decUnsignedInt will fail. This parser allows leading zeroes.

decUnsignedInt# :: e -> Parser e s Int# #

Variant of decUnsignedInt with an unboxed result.

decSignedInt :: e -> Parser e s Int #

Parse a decimal-encoded number. If the number is too large to be represented by a machine integer, this fails with the provided error message. This allows the number to optionally be prefixed by plus or minus. If the sign prefix is not present, the number is interpreted as positive. This allows leading zeroes.

decStandardInt :: e -> Parser e s Int #

Parse a decimal-encoded number. If the number is too large to be represented by a machine integer, this fails with the provided error message. This allows the number to optionally be prefixed by minus. If the minus prefix is not present, the number is interpreted as positive. This disallows a leading plus sign. For example, decStandardInt rejects +42, but decSignedInt allows it.
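
The sign-handling rules of the decimal parsers can be compared side by side. This is a sketch based on the descriptions above; `run` is a hypothetical helper, the error type is `()`, and `Latin1.fromString` is assumed to come from byteslice's `Data.Bytes.Text.Latin1`.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Bytes.Parser (Parser)
import qualified Data.Bytes.Parser as Parser
import qualified Data.Bytes.Parser.Latin as Latin
import qualified Data.Bytes.Text.Latin1 as Latin1

-- Hypothetical helper: run a parser over a Latin-1 encoded String.
run :: (forall s. Parser () s a) -> String -> Maybe a
run p s = Parser.parseBytesMaybe p (Latin1.fromString s)

main :: IO ()
main = do
  print (run (Latin.decWord8 ()) "255")       -- Just 255
  print (run (Latin.decWord8 ()) "256")       -- Nothing: exceeds 8 bits
  print (run (Latin.decUnsignedInt ()) "42")  -- Just 42
  print (run (Latin.decUnsignedInt ()) "-42") -- Nothing: no sign allowed
  print (run (Latin.decStandardInt ()) "-42") -- Just (-42)
  print (run (Latin.decStandardInt ()) "+42") -- Nothing: plus is rejected
  print (run (Latin.decSignedInt ()) "+42")   -- Just 42
```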

decTrailingInt #

Arguments

:: e                  Error message
-> Int                Leading digit, should be between 0 and 9.
-> Parser e s Int

Variant of decUnsignedInt that lets the caller supply a leading digit. This is useful when parsing formats like JSON where integers with leading zeroes are considered invalid. The calling context must consume the first digit before calling this parser. Results are always positive numbers.
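
One way to apply this to a JSON-style grammar is to inspect the first digit by hand and only then hand over to decTrailingInt. The sketch below is an illustration under assumptions, not the library's own JSON parser: `run` and `jsonNat` are hypothetical names, the error type is `()`, `Parser.fail` is taken from Data.Bytes.Parser, and `Latin1.fromString` is assumed to come from byteslice's `Data.Bytes.Text.Latin1`.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Char (ord)
import Data.Bytes.Parser (Parser)
import qualified Data.Bytes.Parser as Parser
import qualified Data.Bytes.Parser.Latin as Latin
import qualified Data.Bytes.Text.Latin1 as Latin1

-- Hypothetical helper: run a parser over a Latin-1 encoded String.
run :: (forall s. Parser () s a) -> String -> Maybe a
run p s = Parser.parseBytesMaybe p (Latin1.fromString s)

-- Natural number without superfluous leading zeroes.
jsonNat :: Parser () s Int
jsonNat = do
  c <- Latin.any ()
  case c of
    '0' -> pure 0 -- a lone zero; any following digit is left unconsumed
    _ | c >= '1' && c <= '9' -> Latin.decTrailingInt () (ord c - ord '0')
      | otherwise -> Parser.fail ()

main :: IO ()
main = do
  print (run (jsonNat <* Latin.endOfInput ()) "42")  -- Just 42
  print (run (jsonNat <* Latin.endOfInput ()) "7")   -- Just 7
  print (run (jsonNat <* Latin.endOfInput ()) "012") -- Nothing: "12" is left over
```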

decSignedInteger :: e -> Parser e s Integer #

Parse a decimal-encoded integer of arbitrary size. This accepts input that begins with a plus or minus sign. Input without a sign prefix is interpreted as positive.

decUnsignedInteger :: e -> Parser e s Integer #

Parse a decimal-encoded positive integer of arbitrary size. This rejects input that begins with a plus or minus sign.

decTrailingInteger #

Arguments

:: Int                    Leading digit, should be between 0 and 9.
-> Parser e s Integer

Variant of decUnsignedInteger that lets the caller supply a leading digit. This is useful when parsing formats like JSON where integers with leading zeroes are considered invalid. The calling context must consume the first digit before calling this parser. Results are always positive numbers.

Hexadecimal

Variable Length

hexWord8 :: e -> Parser e s Word8 #

Parse a hexadecimal-encoded 8-bit word. If the number is larger than 255, this parser fails. This allows leading zeroes and is insensitive to case. For example, 00A, 0a and A would all be accepted as the same number.

hexWord16 :: e -> Parser e s Word16 #

Parse a hexadecimal-encoded 16-bit word. If the number is larger than 65535, this parser fails. This allows leading zeroes and is insensitive to case. For example, 0100a and 100A would both be accepted as the same number.

hexWord32 :: e -> Parser e s Word32 #

Parse a hexadecimal-encoded 32-bit word. If the number is larger than 4294967295, this parser fails. This allows leading zeroes and is insensitive to case.

Fixed Length

hexFixedWord8 :: e -> Parser e s Word8 #

Parse exactly two ASCII-encoded characters, interpreting them as the hexadecimal encoding of an 8-bit number. Note that this rejects a sequence such as A, requiring 0A instead. This is insensitive to case.

hexFixedWord16 :: e -> Parser e s Word16 #

Parse exactly four ASCII-encoded characters, interpreting them as the hexadecimal encoding of a 16-bit number. Note that this rejects a sequence such as 5A9, requiring 05A9 instead. This is insensitive to case. This is particularly useful when parsing escape sequences in C or JSON, which allow encoding characters in the Basic Multilingual Plane as \uhhhh.

hexFixedWord32 :: e -> Parser e s Word32 #

Parse exactly eight ASCII-encoded characters, interpreting them as the hexadecimal encoding of a 32-bit number. Note that this rejects a sequence such as BC5A9, requiring 000BC5A9 instead. This is insensitive to case.

hexFixedWord64 :: e -> Parser e s Word64 #

Parse exactly 16 ASCII-encoded characters, interpreting them as the hexadecimal encoding of a 64-bit number. Note that this rejects a sequence such as BC5A9, requiring 00000000000BC5A9 instead. This is insensitive to case.
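
To contrast the fixed-length and variable-length parsers, the sketch below decodes a `\uhhhh` escape with hexFixedWord16 and a bare hexadecimal number with hexWord16. `run` and `unicodeEscape` are hypothetical names, the error type is `()`, and `Latin1.fromString` is assumed to come from byteslice's `Data.Bytes.Text.Latin1`.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Word (Word16)
import Data.Bytes.Parser (Parser)
import qualified Data.Bytes.Parser as Parser
import qualified Data.Bytes.Parser.Latin as Latin
import qualified Data.Bytes.Text.Latin1 as Latin1

-- Hypothetical helper: run a parser over a Latin-1 encoded String.
run :: (forall s. Parser () s a) -> String -> Maybe a
run p s = Parser.parseBytesMaybe p (Latin1.fromString s)

-- A backslash, the letter u, then exactly four hexadecimal digits.
unicodeEscape :: Parser () s Word16
unicodeEscape = Latin.char2 () '\\' 'u' *> Latin.hexFixedWord16 ()

main :: IO ()
main = do
  print (run unicodeEscape "\\u00E9")     -- Just 233
  print (run unicodeEscape "\\u5A9")      -- Nothing: only three digits
  print (run (Latin.hexWord16 ()) "5A9")  -- Just 1449 (variable length)
```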

Digit

hexNibbleLower :: e -> Parser e s Word #

Consume a single character that is the lowercase hexadecimal encoding of a 4-bit word. Fails if the character is not in the class [a-f0-9].

tryHexNibbleLower :: Parser e s (Maybe Word) #

Consume a single character that is the lowercase hexadecimal encoding of a 4-bit word. Returns Nothing without consuming the character if it is not in the class [a-f0-9]. The parser never fails.

hexNibble :: e -> Parser e s Word #

Consume a single character that is the case-insensitive hexadecimal encoding of a 4-bit word. Fails if the character is not in the class [a-fA-F0-9].

tryHexNibble :: Parser e s (Maybe Word) #

Consume a single character that is the case-insensitive hexadecimal encoding of a 4-bit word. Returns Nothing without consuming the character if it is not in the class [a-fA-F0-9]. This parser never fails.
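
The nibble parsers compose naturally into larger decoders. The following sketch builds a byte from two nibbles and shows the case sensitivity of the lowercase variant; `run`, `hexByte`, and `lowerHexByte` are hypothetical names, the error type is `()`, and `Latin1.fromString` is assumed to come from byteslice's `Data.Bytes.Text.Latin1`.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Bytes.Parser (Parser)
import qualified Data.Bytes.Parser as Parser
import qualified Data.Bytes.Parser.Latin as Latin
import qualified Data.Bytes.Text.Latin1 as Latin1

-- Hypothetical helper: run a parser over a Latin-1 encoded String.
run :: (forall s. Parser () s a) -> String -> Maybe a
run p s = Parser.parseBytesMaybe p (Latin1.fromString s)

-- Two case-insensitive nibbles combined into one value in the range 0-255.
hexByte :: Parser () s Word
hexByte = do
  hi <- Latin.hexNibble ()
  lo <- Latin.hexNibble ()
  pure (hi * 16 + lo)

-- Same shape, but only lowercase digits are accepted.
lowerHexByte :: Parser () s Word
lowerHexByte = do
  hi <- Latin.hexNibbleLower ()
  lo <- Latin.hexNibbleLower ()
  pure (hi * 16 + lo)

main :: IO ()
main = do
  print (run hexByte "fF")            -- Just 255
  print (run lowerHexByte "fF")       -- Nothing: 'F' is uppercase
  print (run Latin.tryHexNibble "a")  -- Just (Just 10)
  print (run Latin.tryHexNibble "g")  -- Just Nothing: parser succeeds, no match
```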