The toplevel system (camllight)

This chapter describes the toplevel system for Caml Light, that permits interactive use of the Caml Light system, through a read-eval-print loop. In this mode, the system repeatedly reads Caml Light phrases from the input, then typechecks, compile and evaluate them, then prints the inferred type and result value, if any. The system prints a # (sharp) prompt before reading each phrase. A phrase can span several lines. Phrases are delimited by ;; (the final double-semicolon).

From the standpoint of the module system, all phrases entered at toplevel are treated as the implementation of a module named top. Hence, all toplevel definitions are entered in the module top.

Unix:
The toplevel system is started by the command camllight. Phrases are read on standard input, results are printed on standard output, errors on standard error. End-of-file on standard input terminates camllight (see also the quit system function below).

The toplevel system does not perform line editing, but it can easily be used in conjunction with an external line editor such as fep; just run fep -emacs camllight or fep -vi camllight. Another option is to use camllight under Gnu Emacs, which gives the full editing power of Emacs (see the directory contrib/camlmode in the distribution).

At any point, the parsing, compilation or evaluation of the current phrase can be interrupted by pressing ctrl-C (or, more precisely, by sending the intr signal to the camllight process). This goes back to the # prompt.

Mac:
The toplevel system is presented as the standalone Macintosh application Caml Light. This application does not require the Macintosh Programmer's Workshop to run.

Once launched from the Finder, the application opens two windows, ``Caml Light Input'' and ``Caml Light Output''. Phrases are entered in the ``Caml Light Input'' window. The ``Caml Light Output'' window displays a copy of the input phrases as they are processed by the Caml Light toplevel, interspersed with the toplevel responses. The ``Return'' key sends the contents of the Input window to the Caml Light toplevel. The ``Enter'' key inserts a newline without sending the contents of the Input window. (This can be configured with the ``Preferences'' menu item.)

The contents of the input window can be edited at all times, with the standard Macintosh interface. An history of previously entered phrases is maintained, and can be accessed with the ``Previous entry'' (command-P) and ``Next entry'' (command-N) menu items.

To quit the Caml Light application, either select ``Quit'' from the ``Files'' menu, or use the quit function described below.

At any point, the parsing, compilation or evaluation of the current phrase can be interrupted by pressing ``command-period'', or by selecting the item ``Interrupt Caml Light'' in the ``Caml Light'' menu. This goes back to the # prompt.

PC:
The toplevel system is presented as a Windows application named Camlwin.exe. It should be launched from the Windows file manager or program manager.

The ``Terminal'' windows is split in two panes. Phrases are entered and edited in the bottom pane. The top pane displays a copy of the input phrases as they are processed by the Caml Light toplevel, interspersed with the toplevel responses. The ``Return'' key sends the contents of the bottom pane to the Caml Light toplevel. The ``Enter'' key inserts a newline without sending the contents of the Input window. (This can be configured with the ``Preferences'' menu item.)

The contents of the input window can be edited at all times, with the standard Windows interface. An history of previously entered phrases is maintained and displayed in a separate window.

To quit the Camlwin application, either select ``Quit'' from the ``File'' menu, or use the quit function described below.

At any point, the parsing, compilation or evaluation of the current phrase can be interrupted by selecting the ``Interrupt Caml Light'' menu item. This goes back to the # prompt.

A text-only version of the toplevel system is available under the name caml.exe. It runs under MSDOS as well as under Windows in a DOS window. No editing facilities are provided.

Options

The following command-line options are recognized by the caml or camllight commands.

-g
Start the toplevel system in debugging mode. This mode gives access to values and types that are local to a module, that is, not exported by the interface of the module. When debugging mode is off, these local objects are not accessible (attempts to access them produce an ``Unbound identifier'' error). When debugging mode is on, these objects become visible, just like the objects that are exported in the module interface. In particular, values of abstract types are printed using their concrete representations, and the functions local to a module can be ``traced'' (see the trace function in section 6.2). This applies only to the modules that have been compiled in debugging mode (either by the batch compiler with the -g option, or by the toplevel system in debugging mode), that is, those modules that have an associated .zix file.

-I directory
Add the given directory to the list of directories searched for compiled interface files (.zi) and compiled object code files (.zo). By default, the current directory is searched first, then the standard library directory. Directories added with -I are searched after the current directory, but before the standard library directory. When several directories are added with several -I options on the command line, these directories are searched from right to left (the rightmost directory is searched first, the leftmost is searched last). Directories can also be added to the search path once the toplevel is running with the #directory directive; see chapter 4.

-lang language-code
Translate the toplevel messages to the specified language. The language-code is fr for French, es for Spanish, de for German, ... (See the file camlmsgs.txt in the Caml Light standard library directory for a list of available languages.) When an unknown language is specified, or no translation is available for a message, American English is used by default.

-O module-set
Specify which set of standard modules is to be implicitly ``opened'' when the toplevel starts. There are three module sets currently available:

cautious
provides the standard operations on integers, floating-point numbers, characters, strings, arrays, ..., as well as exception handling, basic input/output, ... Operations from the cautious set perform range and bound checking on string and vector operations, as well as various sanity checks on their arguments.
fast
provides the same operations as the cautious set, but without sanity checks on their arguments. Programs compiled with -O fast are therefore slightly faster, but unsafe.
none
suppresses all automatic opening of modules. Compilation starts in an almost empty environment. This option is not of general use.
The default compilation mode is -O cautious. See chapter 14 for a complete listing of the modules in the cautious and fast sets.

Unix:
The following environment variables are also consulted:

LANG
When set, control which language is used to print the compiler messages (see the -lang command-line option).

LC_CTYPE
If set to iso_8859_1, accented characters (from the ISO Latin-1 character set) in string and character literals are printed as is; otherwise, they are printed as decimal escape sequences (\ddd).

Toplevel control functions

The standard library module toplevel, opened by default when the toplevel system is launched, provides a number of functions that control the toplevel behavior, load files in memory, and trace program execution.

value quit : unit -> unit
Exit the toplevel loop and terminate the camllight command.
value include : string -> unit
Read, compile and execute source phrases from the given file. The .ml extension is automatically added to the file name, if not present. This is textual inclusion: phrases are processed just as if they were typed on standard input. In particular, global identifiers defined by these phrases are entered in the module named top, not in a new module.
value load : string -> unit
Load in memory the source code for a module implementation. Read, compile and execute source phrases from the given file. The .ml extension is automatically added if not present. The load function behaves much like include, except that a new module is created, with name the base name of the source file name. Global identifiers defined in the source file are entered in this module, instead of the top module as in the case of include. For instance, assuming file foo.ml contains the single phrase
           let bar = 1;;

executing load "foo" defines the identifier foo__bar with value 1. Caveat: the loaded module is not automatically opened: the identifier bar does not automatically complete to foo__bar. To achieve this, you must execute the directive #open "foo" afterwards.

value compile : string -> unit
Compile the source code for a module implementation or interface (.ml or .mli file). Compilation proceeds as with the batch compiler, and produces the same results as camlc -c. If the toplevel system is in debugging mode (option -g or function debug_mode below), the compilation is also performed in debugging mode, as when giving the -g option to the batch compiler. The result of the compilation is left in files (.zo, .zi, .zix). The compiled code is not loaded in memory. Use load_object to load a .zo file produced by compile.
value load_object : string -> unit
Load in memory the compiled bytecode contained in the given file. The .zo extension is automatically added to the file name, if not present. The bytecode file has been produced either by the standalone compiler camlc or by the compile function. Global identifiers defined in the file being loaded are entered in their own module, not in the top module, just as with the load function.
value trace : string -> unit
After the execution of trace "foo", all calls to the global function named foo will be ``traced''. That is, the argument and the result are displayed for each call, as well as the exceptions escaping out of foo, either raised by foo itself, or raised by one of the functions called from foo. If foo is a curried function, each argument is printed as it is passed to the function. Only functions implemented in ML can be traced; system primitives such as string_length or user-supplied C functions cannot.
value untrace : string -> unit
Executing untrace "foo" stops all tracing over the global function named foo.
value verbose_mode: bool -> unit
verbose_mode true causes the compile function to print the inferred types and other information. verbose_mode false reverts to the default silent behavior.
value install_printer : string -> unit
install_printer "printername" registers the function named printername as a printer for objects whose types match its argument type. That is, the toplevel loop will call printername when it has such an object to print. The printing function printername must use the format library module to produce its output, otherwise the output of printername will not be correctly located in the values printed by the toplevel loop.
value remove_printer : string -> unit
remove_printer "printername" removes the function named printername from the table of toplevel printers.
value set_print_depth : int -> unit
set_print_depth n limits the printing of values to a maximal depth of n. The parts of values whose depth exceeds n are printed as ... (ellipsis).
value set_print_length : int -> unit
set_print_length n limits the number of value nodes printed to at most n. Remaining parts of values are printed as ... (ellipsis).
value debug_mode: bool -> unit
Set whether extended module interfaces must be used debug_mode true or not debug_mode false. Extended module interfaces are .zix files that describe the actual implementation of a module, including private types and variables. They are generated when compiling with camlc -g, or with the compile function above when debug_mode is true. When debug_mode is true, toplevel phrases can refer to private types and variables of modules, and private functions can be traced with trace. Setting debug_mode true is equivalent to starting the toplevel with the -g option.
value cd : string -> unit
Change the current working directory.
value directory : string -> unit
Add the given directory to the search path for files. Same behavior as the -I option or the #directory directive.

The toplevel and the module system

Toplevel phrases can refer to identifiers defined in modules other than the top module with the same mechanisms as for separately compiled modules: either by using qualified identifiers (modulename__localname), or by using unqualified identifiers that are automatically completed by searching the list of opened modules. (See section 3.3.) The modules opened at startup are given in the documentation for the standard library. Other modules can be opened with the #open directive.

However, before referencing a global variable from a module other than the top module, a definition of that global variable must have been entered in memory. At start-up, the toplevel system contains the definitions for all the identifiers in the standard library. The definitions for user modules can be entered with the load or load_object functions described above. Referencing a global variable for which no definition has been provided by load or load_object results in the error ``Identifier foo__bar is referenced before being defined''. Since this is a tricky point, let us consider some examples.

  1. The library function sub_string is defined in module string. This module is part of the standard library, and is one of the modules automatically opened at start-up. Hence, both phrases
            sub_string "qwerty" 1 3;;
            string__sub_string "qwerty" 1 3;;
    
    are correct, without having to use #open, load, or load_object.

  2. The library function printf is defined in module printf. This module is part of the standard library, but it is not automatically opened at start-up. Hence, the phrase
            printf__printf "%s %s" "hello" "world";;
    
    is correctly executed, while
            printf "%s %s" "hello" "world";;
    
    causes the error ``Variable printf is unbound'', since none of the currently opened modules define a global with local name printf. However,
            #open "printf";;
            printf "%s %s" "hello" "world";;
    
    executes correctly.

  3. Assume the file foo.ml resides in the current directory, and contains the single phrase
            let x = 1;;
    
    When the toplevel starts, references to x will cause the error `` Variable x is unbound''. References to foo__x will cause the error ``Cannot find file foo.zi'', since the typechecker is attempting to load the compiled interface for module foo in order to find the type of x. To load in memory the module foo, just do:
            load "foo";;
    
    Then, references to foo__x typecheck and evaluate correctly. Since load does not open the module it loads, references to x will still fail with the error ``Variable x is unbound''. You will have to do
            #open "foo";;
    
    explicitly, for x to complete automatically into foo__x.

  4. Finally, assume the file foo.ml above has been previously compiled with the camlc -c command. The current directory therefore contains a compiled interface foo.zi, claiming that foo__x is a global variable with type int, and a compiled bytecode file foo.zo, defining foo__x to have the value 1. When the toplevel starts, references to foo__x will cause the error ``foo__x is referenced before being defined''. In contrast with case 3 above, the typechecker has succeeded in finding the compiled interface for module foo. But execution cannot proceed, because no definition for foo__x has been entered in memory. To do so, execute:
            load_object "foo";;
    
    This loads the file foo.zo in memory, therefore defining foo__x. Then, references to foo__x evaluate correctly. As in case 3 above, references to x still fail, because load_object does not open the module it loads. Again, you will have to do
            #open "foo";;
    
    explicitly, for x to complete automatically into foo__x.

Common errors

This section describes and explains the most frequently encountered error messages.

Cannot find file filename
The named file could not be found in the current directory, nor in the directories of the search path.

If filename has the format mod.zi, this means the current phrase references identifiers from module mod, but you have not yet compiled an interface for module mod. Fix: either load the file mod.ml, which will also create in memory the compiled interface for module mod; or use camlc to compile mod.mli or mod.ml, creating the compiled interface mod.zi, before you start the toplevel.

If filename has the format mod.zo, this means you are trying to load with load_object a bytecode object file that does not exist yet. Fix: compile mod.ml with camlc before you start the toplevel. Or, use load instead of load_object to load the source code instead of a compiled object file.

If filename has the format mod.ml, this means load or include could not find the specified source file. Fix: check the spelling of the file name, or write it if it does not exist.

mod__name is referenced before being defined
You have neglected to load in memory an implementation for a module, with load or load_object. This is explained in full detail in section 6.3 above.

Corrupted compiled interface file filename
See section 5.4.

Expression of type t1 cannot be used with type t2
See section 5.4.

The type inferred for the value name, that is, t, contains type variables that cannot be generalized
See section 5.4.

Building custom toplevel systems: camlmktop

The camlmktop command builds Caml Light toplevels that contain user code preloaded at start-up.

Mac:
This command is not available in the Macintosh version.

The camlmktop command takes as argument a set of .zo files, and links them with the object files that implement the Caml Light toplevel. The typical use is:

        camlmktop -o mytoplevel foo.zo bar.zo gee.zo
This creates the bytecode file mytoplevel, containing the Caml Light toplevel system, plus the code from the three .zo files. To run this toplevel, give it as argument to the camllight command:
        camllight mytoplevel
This starts a regular toplevel loop, except that the code from foo.zo, bar.zo and gee.zo is already loaded in memory, just as if you had typed:
        load_object "foo";;
        load_object "bar";;
        load_object "gee";;
on entrance to the toplevel. The modules foo, bar and gee are not opened, though; you still have to do
        #open "foo";;
yourself, if this is what you wish.

Options

The following command-line options are recognized by camlmktop.

-ccopt option
Pass the given option to the C compiler and linker, when linking in ``custom runtime'' mode. See the corresponding option for camlc, in chapter 5.

-custom
Link in ``custom runtime'' mode. See the corresponding option for camlc, in chapter 5.

-g
Add debugging information to the toplevel file produced, which can then be debugged with camldebug (chapter 10).

-I directory
Add the given directory to the list of directories searched for compiled object code files (.zo).

-o exec-file
Specify the name of the toplevel file produced by the linker.

Unix:
The default is camltop.out.

PC:
The default is camltop.exe. The name must have .exe extension.