This is an abstract class. It defines most methods implemented by its child classes, like DiskFile, MemoryFile and PipeFile.
Methods defined here are intended for basic read/write functionality. Read/write methods operate in either ASCII mode or binary mode.
In ASCII mode, numbers are converted to a human-readable format (characters). Booleans are converted into 0 (false) or 1 (true).
In binary mode, numbers and booleans are encoded directly as they are represented in a register of the computer. While not human readable and less portable, binary mode is faster.
In ASCII mode, if the default option autoSpacing() is chosen, a space will be generated after each written number or boolean. A newline will also be added after each call to a write method. With this option, these spaces are expected to be present when reading. This option can be deactivated with noAutoSpacing().
A Lua error might or might not be generated in case of a read/write error or a problem in the file, depending on the choice made between the quiet() and pedantic() options. It is possible to check whether an error has occurred by calling hasError().
There are three types of reading methods:
[number] readTYPE()
[TYPEStorage] readTYPE(n)
[number] readTYPE(TYPEStorage)
where TYPE can be either Byte, Char, Short, Int, Long, Float or Double.
A convenience method also exists for boolean types: [boolean] readBool(). It reads a value from the file with readInt() and returns true if and only if this value is 1. It is not possible to read storages of booleans.
All these methods depend on the encoding choice: ASCII or binary mode. In ASCII mode, the options autoSpacing() and noAutoSpacing() also have an effect on these methods.
If no parameter is given, one element is returned. This element is converted to a Lua number when reading.
If n is given, n values of the specified type are read and returned in a new Storage of that particular type. The storage size corresponds to the number of elements actually read.
If a Storage is given, the method will attempt to read a number of elements equal to the size of the given storage, and fill up the storage with these elements. The number of elements actually read is returned.
In case of read error, these methods will call the Lua error function using the default pedantic option, or stay quiet with the quiet option. In the latter case, one can check if an error occurred with hasError().
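The three reading forms can be sketched as follows. This is a minimal illustration, assuming a torch.MemoryFile left in its default ASCII mode with autoSpacing(); the variable names are made up for the example.

```lua
require 'torch'

-- In-memory file, readable and writable, ASCII mode by default.
local f = torch.MemoryFile()
for _, v in ipairs({7, 8, 9, 10, 11}) do
   f:writeInt(v)
end
f:seek(1)                              -- back to the beginning

local first = f:readInt()              -- form 1: one element, as a Lua number
local pair = f:readInt(2)              -- form 2: a new IntStorage holding 2 elements
local buffer = torch.IntStorage(2)
local count = f:readInt(buffer)        -- form 3: fills buffer, returns elements read
```

Here first holds the first value written, pair holds the next two, and buffer receives the last two.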
There are two types of writing methods:
[number] writeTYPE(number)
[number] writeTYPE(TYPEStorage)
where TYPE can be either Byte, Char, Short, Int, Long, Float or Double.
A convenience method also exists for boolean types: writeBool(value). If value is nil or not true, it is equivalent to a writeInt(0) call, else to writeInt(1). It is not possible to write storages of booleans.
All these methods depend on the encoding choice: ASCII or binary mode. In ASCII mode, the options autoSpacing() and noAutoSpacing() also have an effect on these methods.
If one Lua number is given, this number is converted according to the name of the method when writing (e.g. writeInt(3.14) will write 3).
If a Storage is given, the method will attempt to write all the elements contained in the storage.
These methods return the number of elements actually written.
In case of write error, these methods will call the Lua error function using the default pedantic option, or stay quiet with the quiet option. In the latter case, one can check if an error occurred with hasError().
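A short sketch of the writing forms, again assuming a torch.MemoryFile in default ASCII mode so the values can be read back immediately. The variable names are illustrative only.

```lua
require 'torch'

local f = torch.MemoryFile()

f:writeInt(3.14)                                   -- converted to an int: 3 is written
local values = torch.DoubleStorage({0.5, 1.5, 2.5})
f:writeDouble(values)                              -- writes every element of the storage
f:writeBool(true)                                  -- equivalent to writeInt(1)

f:seek(1)                                          -- read everything back in order
local i = f:readInt()
local d = f:readDouble(3)
local b = f:readBool()
```

Reading back in the same order and with the same types as the writes is what keeps the stream consistent.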
These methods allow the user to save any serializable object on disk and reload it later in its original state. In other words, they can perform a deep copy of an object into a given File.
Serializable objects are Torch objects having a read() and write() method. Lua objects such as table, number or string, as well as pure Lua functions, are also serializable.
If the object to save contains several other objects (say it is a tree of objects), then objects appearing several times in this tree will be saved only once. This saves disk space, speeds up loading/saving and respects the dependencies between objects.
Interestingly, if the File is a MemoryFile, it allows the user to easily make a clone of any serializable object:
file = torch.MemoryFile() -- creates a file in memory
file:writeObject(object) -- writes the object into file
file:seek(1) -- comes back at the beginning of the file
objectClone = file:readObject() -- gets a clone of object
Returns the next serializable object saved beforehand in the file with writeObject().
Note that objects which were written with the same reference still share the same reference after loading.
Example:
-- creates an array which contains twice the same tensor
array = {}
x = torch.Tensor(1)
table.insert(array, x)
table.insert(array, x)
-- array[1] and array[2] refer to the same address
array[1][1] = 3.14
-- now x[1] == array[1][1] == array[2][1] == 3.14
-- write the array on disk
file = torch.DiskFile('foo.asc', 'w')
file:writeObject(array)
file:close() -- make sure the data is written
-- reload the array
file = torch.DiskFile('foo.asc', 'r')
arrayNew = file:readObject()
-- arrayNew[1] and arrayNew[2] refer to the same address!
-- arrayNew[1][1] == arrayNew[2][1] == 3.14
-- so if we do now:
arrayNew[1][1] = 2.72
-- arrayNew[1][1] == arrayNew[2][1] == 2.72 !
Writes object into the file. This object can be read later using readObject(). Serializable objects are Torch objects having a read() and write() method. Lua objects such as table, number or string, as well as pure Lua functions, are also serializable.
If the object has already been written in the file, only a reference to this already saved object will be written: this saves space and speeds up writing; it also keeps the dependencies between objects intact.
As a consequence, if one writes an object, modifies one of its members, and writes the object again in the same file, the modifications will not be recorded in the file, as only a reference to the original will be written. See readObject() for an example.
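The effect of writing the same object twice can be sketched as below. This is a minimal illustration using a plain Lua table and a torch.MemoryFile with reference tracking left at its default; the table and its value field are hypothetical.

```lua
require 'torch'

local f = torch.MemoryFile()
local t = {value = 1}

f:writeObject(t)        -- full contents are written
t.value = 2
f:writeObject(t)        -- same table: only a reference is written

f:seek(1)
local a = f:readObject()
local b = f:readObject()
-- Both reads yield the table as it was at the first write,
-- and they are one and the same table.
```

The second write does not capture value = 2, which is the behaviour described above.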
If format starts with "*l" then returns the next line in the File. The end-of-line character is skipped.
If format starts with "*a" then returns all the remaining contents of the File.
If no data is available, an error is raised, unless the File is in quiet() mode, in which case an empty string '' is returned; calling hasError() on the file afterwards shows that the last read failed because the end of file was reached.
Because Torch is more precise on number typing, the Lua format "*n" is not supported: instead use one of the number read methods.
Writes the string str in the File. If the string cannot be written completely, an error is raised, unless the File is in quiet() mode, in which case it returns the number of characters actually written.
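A small sketch of the string methods, assuming the two descriptions above correspond to readString(format) and writeString(str) as named in the Torch7 File API, and using a torch.MemoryFile for convenience.

```lua
require 'torch'

local f = torch.MemoryFile()
f:writeString("first line\nsecond line\n")
f:seek(1)

local line = f:readString("*l")   -- next line, end-of-line character skipped
local rest = f:readString("*a")   -- everything that remains in the file
```

After the "*l" read, the position sits just past the first newline, so "*a" starts at the second line.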
The data read or written will be in ASCII mode: all numbers are converted to characters (human readable format) and booleans are converted to 0 (false) or 1 (true). The input-output format in this mode depends on the options autoSpacing() and noAutoSpacing().
In ASCII mode, write additional spaces around the elements written on disk: if writing a Storage, a space will be generated between each element and a newline after the last element. If only writing one element, a newline will be generated after this element.
These spaces are expected to be present when reading in this mode.
This is the default behavior. You can deactivate this option with the noAutoSpacing() method.
The data read or written will be in binary mode: the representation in the File is the same as the one in the computer memory/register (not human readable). This mode is faster than ASCII but less portable.
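The difference between the two modes can be sketched with two in-memory files. This is an illustration only; it assumes a double occupies 8 bytes, which holds on common platforms.

```lua
require 'torch'

-- ASCII mode: the number is stored as readable characters.
local a = torch.MemoryFile()
a:ascii()
a:writeDouble(0.25)
a:seek(1)
local text = a:readString("*l")      -- the digits, as a string

-- Binary mode: the number is stored as raw bytes.
local b = torch.MemoryFile()
b:binary()
b:writeDouble(0.25)
local after = b:position()           -- 1 + 8 bytes on common platforms
b:seek(1)
local x = b:readDouble()
```

The ASCII file can be inspected as text, while the binary file holds the in-memory representation of the value.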
Clear the error flag returned by hasError().
Close the file. Any subsequent operation will generate a Lua error.
In ASCII mode, do not put extra spaces between elements written on disk. This is the opposite of the option autoSpacing().
If the child class buffers the data while writing, ensure that the data is actually written.
If this mode is chosen (which is the default), a Lua error will be generated in case of error (which will cause the program to stop).
It is possible to use quiet() to avoid Lua error generation and set a flag instead.
Returns the current position (in bytes) in the file. The first position is 1 (following Lua standard indexing).
If this mode is chosen instead of pedantic(), no Lua error will be generated in case of read/write error. Instead, a flag will be raised, readable through hasError(). This flag can be cleared with clearError().
Whether a file is quiet can be checked using isQuiet().
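The two error-handling modes can be compared by reading from an empty file. This is a minimal sketch using torch.MemoryFile; pcall is used only to show that the pedantic read raises a Lua error.

```lua
require 'torch'

-- Pedantic (default): a failed read raises a Lua error.
local p = torch.MemoryFile()
local ok = pcall(function() return p:readInt() end)

-- Quiet: a failed read only raises the error flag.
local q = torch.MemoryFile()
q:quiet()
q:readInt()                          -- nothing to read, but no Lua error
local failed = q:hasError()
q:clearError()
local cleared = not q:hasError()
```

In quiet mode the value returned by a failed read should not be trusted; hasError() is the signal to check.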
Jump into the file at the given position (in bytes). Might raise an error in case of a problem. The first position is 1 (following Lua standard indexing).
Jump to the end of the file. Might raise an error in case of a problem.
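Positioning can be sketched as follows, assuming a torch.MemoryFile in binary mode so that each written byte advances the position by exactly one.

```lua
require 'torch'

local f = torch.MemoryFile()
f:binary()
f:writeByte(10)
f:writeByte(20)
f:writeByte(30)

local afterWrite = f:position()   -- one past the last byte written
f:seek(2)                         -- positions start at 1
local second = f:readByte()
f:seekEnd()
local atEnd = f:position()
```

Because positions are 1-based, a file holding three bytes has its end at position 4.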
These methods allow the user to query the state of the given File.
Returns whether an error occurred since the last clearError() call, or since the opening of the file if clearError() has never been called.
Returns a boolean which tells if the file is in quiet mode or not.
Tells if one can read the file or not.
Tells if one can write in the file or not.
Returns true if autoSpacing has been chosen.
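The state queries can be exercised on a fresh file. This sketch assumes a torch.MemoryFile opened with its default read/write mode.

```lua
require 'torch'

local f = torch.MemoryFile()

-- Defaults: readable, writable, pedantic, autoSpacing, no error.
local readable = f:isReadable()
local writable = f:isWritable()
local quietBefore = f:isQuiet()
local spacingBefore = f:isAutoSpacing()
local errorBefore = f:hasError()

-- Change two options and query again.
f:quiet()
f:noAutoSpacing()
local quietAfter = f:isQuiet()
local spacingAfter = f:isAutoSpacing()
```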
Sets the referenced property of the File to ref. ref has to be true or false.
By default ref is true, which means that a File object keeps track of objects written (using the writeObject method) or read (using the readObject method). Objects with the same address will be written or read only once, meaning that this approach preserves shared memory structures.
Keeping track of references has a cost: every object which is serialized in the file is kept alive (even if one discards the object after writing/reading) as File needs to track their pointers. This is not always a desirable behavior, especially when dealing with large data structures.
Another typical case where one does not want reference tracking is when one needs to push the same tensor repeatedly into a file, changing its contents each time: calling referenced(false) ensures the desired behaviour.
Returns the state set by referenced.
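Disabling reference tracking can be sketched as below, using a plain Lua table in place of a tensor and a torch.MemoryFile. The table and its value field are hypothetical.

```lua
require 'torch'

local f = torch.MemoryFile()
f:referenced(false)                -- do not track written/read objects
local tracking = f:isReferenced()

local t = {value = 1}
f:writeObject(t)                   -- contents written in full
t.value = 2
f:writeObject(t)                   -- contents written in full again

f:seek(1)
local a = f:readObject()
local b = f:readObject()
-- a and b are two distinct tables, each reflecting the state at its own write.
```

With tracking disabled, each write records the current contents, at the cost of losing shared structure between objects.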