
Module Xml_print.Utf8

module Utf8 : sig..end

UTF-8 normalizer and encoder for HTML.

Given a module Htmlprinter produced by one of the functors in Xml_print, this module is used as follows:

let encode x = fst (Utf8.normalize_html x) in
    Htmlprinter.print ~encode document

type utf8 = string

normalize str takes a possibly invalid UTF-8 string and returns a valid UTF-8 string in which invalid bytes have been replaced by the replacement character U+FFFD. The returned boolean is true if invalid bytes were found.
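To make the replacement behaviour concrete, here is a minimal sketch of the same idea written against the OCaml standard library only (OCaml 4.14 or later). The name normalize_sketch is hypothetical and this is not the code used by Xml_print.Utf8; in particular, the real function may emit a different number of U+FFFD characters for a given run of invalid bytes.

```ocaml
(* Hypothetical stdlib-only sketch, NOT the Xml_print.Utf8 implementation.
   Decodes the input one scalar value at a time and re-encodes it,
   substituting U+FFFD for every decode error. *)
let normalize_sketch (s : string) : string * bool =
  let buf = Buffer.create (String.length s) in
  let invalid = ref false in
  let rec loop i =
    if i < String.length s then begin
      let d = String.get_utf_8_uchar s i in
      if Uchar.utf_decode_is_valid d then
        Buffer.add_utf_8_uchar buf (Uchar.utf_decode_uchar d)
      else begin
        (* Remember that at least one invalid sequence was seen. *)
        invalid := true;
        Buffer.add_utf_8_uchar buf Uchar.rep
      end;
      (* Advance by the number of bytes the decoder consumed. *)
      loop (i + Uchar.utf_decode_length d)
    end
  in
  loop 0;
  (Buffer.contents buf, !invalid)
```

The pair it returns has the same shape as the result of normalize: the cleaned string first, then the flag reporting whether anything had to be replaced.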

val normalize : string -> utf8 * bool
val normalize_html : string -> utf8 * bool

Same as normalize, plus some extra work: it encodes the '<', '>', '"' and '&' characters as the corresponding entities and replaces invalid HTML characters with U+FFFD.
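The entity-encoding step can be illustrated with a small stdlib-only sketch. The name escape_html_sketch is hypothetical, the exact entity spellings (&lt;, &gt;, &quot;, &amp;) are an assumption, and the replacement of invalid HTML characters is deliberately left out, so this is not the library's implementation.

```ocaml
(* Hypothetical sketch of the entity-encoding step only.
   Assumption: the four characters map to &lt; &gt; &quot; &amp;.
   Working byte by byte is safe for UTF-8 input because these four
   ASCII bytes never occur inside a multi-byte sequence. *)
let escape_html_sketch (s : string) : string =
  let buf = Buffer.create (String.length s) in
  String.iter
    (function
      | '<' -> Buffer.add_string buf "&lt;"
      | '>' -> Buffer.add_string buf "&gt;"
      | '"' -> Buffer.add_string buf "&quot;"
      | '&' -> Buffer.add_string buf "&amp;"
      | c -> Buffer.add_char buf c)
    s;
  Buffer.contents buf
```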

type encoding = 
  [ `ISO_8859_1 | `US_ASCII | `UTF_16 | `UTF_16BE | `UTF_16LE | `UTF_8 ]
val normalize_from : 
  encoding:[< encoding ] ->
  string -> utf8 * bool

normalize_from ~encoding str converts the string str into a UTF-8 string. It assumes that str is encoded in encoding and replaces invalid bytes with the replacement character U+FFFD. The returned boolean is true if invalid bytes were found.
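As an illustration of what converting from another encoding involves, the sketch below handles only the two single-byte encodings using the standard library. The name normalize_from_sketch is hypothetical, and two behaviours are assumptions rather than documented facts about Xml_print.Utf8: every ISO-8859-1 byte is treated as valid, and every US-ASCII byte above 0x7F is treated as invalid.

```ocaml
(* Hypothetical sketch covering `ISO_8859_1 and `US_ASCII only;
   NOT the Xml_print.Utf8 implementation. *)
let normalize_from_sketch ~(encoding : [ `ISO_8859_1 | `US_ASCII ])
    (s : string) : string * bool =
  let buf = Buffer.create (String.length s) in
  let invalid = ref false in
  String.iter
    (fun c ->
      let code = Char.code c in
      match encoding with
      | `US_ASCII when code > 0x7F ->
          (* Assumption: bytes outside 7-bit ASCII count as invalid. *)
          invalid := true;
          Buffer.add_utf_8_uchar buf Uchar.rep
      | `US_ASCII | `ISO_8859_1 ->
          (* Each remaining byte is the code point U+0000..U+00FF itself. *)
          Buffer.add_utf_8_uchar buf (Uchar.of_int code))
    s;
  (Buffer.contents buf, !invalid)
```

For example, the Latin-1 byte 0xE9 (é) becomes the two UTF-8 bytes 0xC3 0xA9 under `ISO_8859_1, while under `US_ASCII the same byte is replaced by U+FFFD and the flag is set.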