module Make:
Character Information
type
general_category_type = [ `Cc
| `Cf
| `Cn
| `Co
| `Cs
| `Ll
| `Lm
| `Lo
| `Lt
| `Lu
| `Mc
| `Me
| `Mn
| `Nd
| `Nl
| `No
| `Pc
| `Pd
| `Pe
| `Pf
| `Pi
| `Po
| `Ps
| `Sc
| `Sk
| `Sm
| `So
| `Zl
| `Zp
| `Zs ]
Type of Unicode general character categories.
Each variant denotes the following category:
`Lu : Letter, Uppercase
`Ll : Letter, Lowercase
`Lt : Letter, Titlecase
`Mn : Mark, Non-Spacing
`Mc : Mark, Spacing Combining
`Me : Mark, Enclosing
`Nd : Number, Decimal Digit
`Nl : Number, Letter
`No : Number, Other
`Zs : Separator, Space
`Zl : Separator, Line
`Zp : Separator, Paragraph
`Cc : Other, Control
`Cf : Other, Format
`Cs : Other, Surrogate
`Co : Other, Private Use
`Cn : Other, Not Assigned
`Lm : Letter, Modifier
`Lo : Letter, Other
`Pc : Punctuation, Connector
`Pd : Punctuation, Dash
`Ps : Punctuation, Open
`Pe : Punctuation, Close
`Pi : Punctuation, Initial
`Pf : Punctuation, Final
`Po : Punctuation, Other
`Sm : Symbol, Math
`Sc : Symbol, Currency
`Sk : Symbol, Modifier
`So : Symbol, Other
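The two-letter tags group naturally by their first letter. The following is a minimal sketch of that grouping, not part of the library: the type is copied locally so the block compiles without Camomile, and major_class is a hypothetical helper name. Because polymorphic variants are structural, the same function should also accept a value returned by general_category.

```ocaml
(* Local copy of the documented type, so the sketch is self-contained. *)
type general_category_type =
  [ `Cc | `Cf | `Cn | `Co | `Cs
  | `Ll | `Lm | `Lo | `Lt | `Lu
  | `Mc | `Me | `Mn
  | `Nd | `Nl | `No
  | `Pc | `Pd | `Pe | `Pf | `Pi | `Po | `Ps
  | `Sc | `Sk | `Sm | `So
  | `Zl | `Zp | `Zs ]

(* Hypothetical helper: map a category to the major class named
   before the comma in the list above. *)
let major_class : general_category_type -> string = function
  | `Lu | `Ll | `Lt | `Lm | `Lo -> "Letter"
  | `Mn | `Mc | `Me -> "Mark"
  | `Nd | `Nl | `No -> "Number"
  | `Pc | `Pd | `Ps | `Pe | `Pi | `Pf | `Po -> "Punctuation"
  | `Sm | `Sc | `Sk | `So -> "Symbol"
  | `Zs | `Zl | `Zp -> "Separator"
  | `Cc | `Cf | `Cs | `Co | `Cn -> "Other"
```

The match is exhaustive over all thirty tags, so the compiler will flag any tag that is later added to the type without a corresponding case.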
val general_category : CamomileLibrary.UChar.t ->
general_category_type
val load_general_category_map : unit ->
general_category_type CamomileLibrary.UMap.t
type
character_property_type = [ `Alphabetic
| `Ascii_Hex_Digit
| `Bidi_Control
| `Default_Ignorable_Code_Point
| `Deprecated
| `Diacritic
| `Extender
| `Grapheme_Base
| `Grapheme_Extend
| `Grapheme_Link
| `Hex_Digit
| `Hyphen
| `IDS_Binary_Operator
| `IDS_Trinary_Operator
| `ID_Continue
| `ID_Start
| `Ideographic
| `Logical_Order_Exception
| `Lowercase
| `Math
| `Noncharacter_Code_Point
| `Other_Alphabetic
| `Other_Grapheme_Extend
| `Other_Lowercase
| `Other_Math
| `Other_Uppercase
| `Other_default_Ignorable_Code_Point
| `Quotation_Mark
| `Radical
| `Soft_Dotted
| `Terminal_Punctuation
| `Unified_Ideograph
| `Uppercase
| `White_Space
| `XID_Continue
| `XID_Start ]
Type of character properties.
val load_property_tbl : character_property_type ->
CamomileLibrary.UCharTbl.Bool.t
Load the table for the given character property.
val load_property_tbl_by_name : string -> CamomileLibrary.UCharTbl.Bool.t
Load the table for the character property with the given name.
The name is the polymorphic variant tag of the property
without the leading `.
val load_property_set : character_property_type ->
CamomileLibrary.USet.t
Load the set of characters that have the given character property.
val load_property_set_by_name : string -> CamomileLibrary.USet.t
Load the set of characters that have the character property with the given name.
The name is the polymorphic variant tag of the property
without the leading `.
type
script_type = [ `Arabic
| `Armenian
| `Bengali
| `Bopomofo
| `Buhid
| `Canadian_Aboriginal
| `Cherokee
| `Common
| `Cyrillic
| `Deseret
| `Devanagari
| `Ethiopic
| `Georgian
| `Gothic
| `Greek
| `Gujarati
| `Gurmukhi
| `Han
| `Hangul
| `Hanunoo
| `Hebrew
| `Hiragana
| `Inherited
| `Kannada
| `Katakana
| `Khmer
| `Lao
| `Latin
| `Malayalam
| `Mongolian
| `Myanmar
| `Ogham
| `Old_Italic
| `Oriya
| `Runic
| `Sinhala
| `Syriac
| `Tagalog
| `Tagbanwa
| `Tamil
| `Telugu
| `Thaana
| `Thai
| `Tibetan
| `Yi ]
Type of scripts.
val script : CamomileLibrary.UChar.t -> script_type
val load_script_map : unit -> script_type CamomileLibrary.UMap.t
type
version_type = [ `Nc | `v1_0 | `v1_1 | `v2_0 | `v2_1 | `v3_0 | `v3_1 | `v3_2 ]
Age
val age : CamomileLibrary.UChar.t -> version_type
age c is the Unicode version in which c was introduced.
val older : version_type ->
version_type -> bool
older v1 v2 is true if v1 is older than, or the same version as, v2.
Everything is older than `Nc.
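The ordering described for older can be modelled by ranking the versions. The sketch below is an illustration of the documented behaviour under stated assumptions, not the library's implementation: rank and older_model are hypothetical names, the type is copied locally so the block runs without Camomile, and the result for `Nc as the first argument (false unless the second is also `Nc) is an assumption that the text above does not spell out.

```ocaml
(* Local copy of the documented type, so the sketch is self-contained. *)
type version_type =
  [ `Nc | `v1_0 | `v1_1 | `v2_0 | `v2_1 | `v3_0 | `v3_1 | `v3_2 ]

(* Hypothetical ranking: earlier versions get smaller numbers,
   and `Nc is placed after every real version. *)
let rank : version_type -> int = function
  | `v1_0 -> 0
  | `v1_1 -> 1
  | `v2_0 -> 2
  | `v2_1 -> 3
  | `v3_0 -> 4
  | `v3_1 -> 5
  | `v3_2 -> 6
  | `Nc -> max_int

(* Model of "older or the same version". *)
let older_model (v1 : version_type) (v2 : version_type) : bool =
  rank v1 <= rank v2
```

With this model, older_model `v1_1 `v3_0 and older_model `v2_0 `v2_0 both hold, and any version compared against `Nc yields true.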
Casing
val load_to_lower1_tbl : unit -> CamomileLibrary.UChar.t CamomileLibrary.UCharTbl.t
val load_to_upper1_tbl : unit -> CamomileLibrary.UChar.t CamomileLibrary.UCharTbl.t
val load_to_title1_tbl : unit -> CamomileLibrary.UChar.t CamomileLibrary.UCharTbl.t
type
casemap_condition = [ `AfterSoftDotted
| `BeforeDot
| `FinalSigma
| `Locale of string
| `MoreAbove
| `Not of casemap_condition ]
type
special_casing_property = {
}
val load_conditional_casing_tbl : unit ->
special_casing_property list
CamomileLibrary.UCharTbl.t
val load_casefolding_tbl : unit -> CamomileLibrary.UChar.t list CamomileLibrary.UCharTbl.t
val combined_class : CamomileLibrary.UChar.t -> int
Combined class (the Unicode canonical combining class).
The combined class is an integer in the range 0 to 255 that indicates how
this character interacts with other combining characters.
Decomposition
type
decomposition_type = [ `Canon
| `Circle
| `Compat
| `Final
| `Font
| `Fraction
| `Initial
| `Isolated
| `Medial
| `Narrow
| `NoBreak
| `Small
| `Square
| `Sub
| `Super
| `Vertical
| `Wide ]
Types of decomposition.
type
decomposition_info = [ `Canonform
| `Composite of
decomposition_type *
CamomileLibrary.UChar.t list
| `HangulSyllable ]
val load_decomposition_tbl : unit ->
decomposition_info CamomileLibrary.UCharTbl.t
Canonical Composition
val load_composition_tbl : unit ->
(CamomileLibrary.UChar.t * CamomileLibrary.UChar.t) list
CamomileLibrary.UCharTbl.t
The return value [(u_1, u'_1); ...; (u_n, u'_n)] means that,
for the given character u, the sequence u u_i forms
the canonical composition u'_i.
If u is a Hangul jamo, the returned list is [].
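Reading a table entry then amounts to an association-list lookup on the character that follows u. The sketch below assumes that reading and models code points as plain ints so that it runs without Camomile. The names compose and entries_for_a are hypothetical, and the sample entries are written by hand rather than loaded from the library.

```ocaml
(* Given the entry for some character u, i.e. a list of
   (next character, composed character) pairs, return the canonical
   composition of u followed by [next], if the list contains one. *)
let compose (entries : (int * int) list) (next : int) : int option =
  List.assoc_opt next entries

(* Hand-written sample entry for U+0041 (LATIN CAPITAL LETTER A):
   followed by U+0300 (combining grave) it composes to U+00C0,
   followed by U+0301 (combining acute) it composes to U+00C1. *)
let entries_for_a = [ (0x0300, 0x00C0); (0x0301, 0x00C1) ]
```

An empty entry, such as the one documented for Hangul jamo, makes compose return None for every following character.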
val load_composition_exclusion_tbl : unit -> CamomileLibrary.UCharTbl.Bool.t
Whether the given composed character is used in NFC or NFKC