Fossil

Check-in [b807acf6]

Overview
Comment:Documentation updates
Downloads: Tarball | ZIP archive | SQL archive
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1:b807acf62eecfbf884fa10244ea59d78f39ac0a9
User & Date: drh 2007-07-24 12:52:32
Context
2007-07-24
12:54
Merge in the latest SQLite updates. check-in: d8590e09 user: drh tags: trunk
12:52
Documentation updates check-in: b807acf6 user: drh tags: trunk
2007-07-23
20:42
Always do another sync round if any file is received. check-in: 0feed850 user: drh tags: trunk
Changes

Changes to www/fileformat.html.

<html>
<head>
<title>Fossil File Formats</title>
</head>
<body bgcolor="white">
<h1 align="center">
Fossil File Formats
</h1>

<p>
The global state of a fossil repository is determined by an unordered
set of content files.  Each of these files has a format which is defined
by this document.


</p>

<h2>1.0 General Formatting Rules</h2>












<p>
Fossil content files consist of a header, a blank line, and optional
content.















</p>

<p>
The header is divided into "properties" by newline ('\n', 0x0a)
characters.  Each header property is divided into tokens by space (' ', 0x20)
characters.  The first token of each property is the property name.
Subsequent tokens (if any) are arguments to the property.
</p>

<p>
The blank line that separates the header from the content can be
thought of as a property line that contains no tokens.  Everything
that follows the newline character that terminates the blank line
is content.  The blank line is always present but the content is
optional.
</p>

<p>
All tokens in a property line are encoded to escape special characters.
The encoding is as follows:
</p>

<blockquote>
<table border="1">
<tr><th>Input Character</th><th>Encoded As</th></tr>
<tr><td align="center"> space (0x20) </td><td align="center"> \s </td></tr>
<tr><td align="center"> newline (0x0A) </td><td align="center"> \n </td></tr>
<tr><td align="center"> carriage return (0x0D) </td><td align="center"> \r </td></tr>
<tr><td align="center"> tab (0x09) </td><td align="center"> \t </td></tr>
<tr><td align="center"> vertical tab (0x0B) </td><td align="center"> \v </td></tr>
<tr><td align="center"> formfeed (0x0C) </td><td align="center"> \f </td></tr>
<tr><td align="center"> nul (0x00) </td><td align="center"> \0 </td></tr>
<tr><td align="center"> backslash (0x5C) </td><td align="center"> \\ </td></tr>
</table>







</blockquote>

<p>
Characters other than the ones shown in the table above are passed through
the encoder without change.










</p>

<p>
All property names are unpunctuated lower-case ASCII strings.
The properties appear in the header in sorted order (using
memcmp() as the comparison function) except for the "signature"
property, which always occurs first.
</p>

<h2>2.0 Common Properties</h2>

<p>
Every content file has a "time" property.  The argument to the
time property is an integer which is the number of seconds since
1970 UTC when the content file was created.  For example:




</p>

<blockquote>
time 1181404746
</blockquote>

<p>
Every content file has a "type" property.  The argument to the
type property defines the purpose of the content file.  The
argument can be strings like "version", "folder", "file", or "user".









</p>

<p>
The first property of a content file is the digital signature.  The
name of the signature property is "signature".  There are two arguments.
The first argument is the SHA256 hash of the content file that defines
the user who signed this file.  User records themselves are self-signed
and so the first argument is simply "*" for user records.  The second
argument is the digital signature of an SHA256 hash of the entire
file (header and content) except for the signature line itself.











</p>





























































<html>
<head>
<title>Fossil File Format</title>
</head>
<body bgcolor="white">
<h1 align="center">
Fossil File Formats
</h1>

<p>
The global state of a fossil repository is determined by an unordered
set of files.  Some files, such as those that represent wiki pages and
trouble tickets, and the special "manifest" file, have a specific and
well-defined format.  Other files simply hold the content of project
files.  Files can be text or binary.
</p>


<p>
Each file in the repository is named by its SHA1 hash.
Some files have a particular format which qualifies them
as "manifests".  A manifest assigns filenames to a subset
of the files in the repository, in order to provide a
snapshot of the state of the project at a point in time.
Each manifest file corresponds to a version or baseline
of the project.
</p>

<h2>1.0 The Manifest File</h2>

<p>


Any file in the repository that follows the syntactic rules
of a manifest is a manifest.  Note that a manifest can
be both a real manifest and also a content file, though this
is rare.
</p>

<p>
A manifest is a line-oriented text file.  Newline characters
(ASCII 0x0a) separate lines.  Each line begins with a single
character "line type".  Zero or more arguments may follow
the line type.  All arguments are separated from each other
and from the line-type character by a single space
character.  There is no surplus white space between arguments
and no leading or trailing whitespace except for the newline 
character that acts as the line separator.
</p>

<p>
All lines of the manifest occur in strict sorted lexicographical order.
No line may be duplicated.
The entire manifest file may be PGP clear-signed, but otherwise it
may contain no additional text or data beyond what is described here.
</p>

<p>
Allowed lines in the manifest are as follows:




</p>






<blockquote>











<b>C</b> <i>checkin-comment</i><br>
<b>D</b> <i>time-and-date-stamp</i><br>
<b>F</b> <i>filename</i> <i>SHA1-hash</i><br>
<b>P</b> <i>SHA1-hash</i>+<br>
<b>R</b> <i>repository-checksum</i><br>
<b>U</b> <i>user-login</i><br>
<b>Z</b> <i>manifest-checksum</i>
</blockquote>
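
<p>
The line rules above can be sketched as a small parser.  This is only an
illustration and not the parser that fossil uses: the function name
<i>parse_manifest</i> is hypothetical, sorted order is assumed to mean
Python's default string ordering, and PGP clear-signing is ignored.
</p>

```python
ALLOWED_TYPES = set("CDFPRUZ")  # line types listed in the table above


def parse_manifest(text):
    """Split a manifest into (line_type, arguments) pairs.

    Hypothetical helper: raises ValueError if a line type is unknown or
    if the lines are not in strictly increasing order (which also rules
    out duplicate lines).
    """
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty piece after the final newline
    for earlier, later in zip(lines, lines[1:]):
        if not earlier < later:
            raise ValueError("lines are not in strict sorted order")
    cards = []
    for line in lines:
        kind, *args = line.split(" ")  # single-space separators only
        if kind not in ALLOWED_TYPES:
            raise ValueError("unknown line type: %r" % kind)
        cards.append((kind, args))
    return cards
```

<p>
Arguments are returned still escaped; decoding them is a separate step.
</p>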

<p>


A manifest must have exactly one C-line.  The sole argument to
the C-line is a check-in comment that describes the baseline that
the manifest defines.  The check-in comment is text.  The following
escape sequences are applied to the text:
A space (ASCII 0x20) is represented as "\s" (ASCII 0x5C, 0x73).  A
newline (ASCII 0x0A) is represented as "\n" (ASCII 0x5C, 0x6E).  A backslash 
(ASCII 0x5C) is represented as two backslashes "\\".  Apart from
space and newline, no other whitespace characters are allowed in
the check-in comment.  Nor are any unprintable characters allowed
in the comment.
</p>
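
<p>
The three escapes described above can be sketched as a pair of helper
functions.  The names <i>escape_arg</i> and <i>unescape_arg</i> are
hypothetical and are not taken from the fossil sources; the sketch also
does not reject the whitespace and unprintable characters that the text
forbids.
</p>

```python
def escape_arg(text):
    """Encode an argument: backslash first, then space and newline.

    Backslashes must be doubled before the other replacements so that
    the backslashes introduced by "\\s" and "\\n" are not doubled again.
    """
    return text.replace("\\", "\\\\").replace(" ", "\\s").replace("\n", "\\n")


def unescape_arg(token):
    """Reverse escape_arg by scanning the token from left to right."""
    mapping = {"s": " ", "n": "\n", "\\": "\\"}
    out = []
    i = 0
    while i < len(token):
        ch = token[i]
        if ch == "\\" and i + 1 < len(token) and token[i + 1] in mapping:
            out.append(mapping[token[i + 1]])
            i += 2  # consume the backslash and the escape letter
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

<p>
Decoding scans one escape at a time instead of chaining replacements, so
that a literal backslash followed by "s" is not mistaken for an escaped
space.
</p>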

<p>












A manifest must have exactly one D-line.  The sole argument to
the D-line is a date-time stamp in the ISO8601 format.  The
date and time should be in coordinated universal time (UTC).
The format is:
</p>

<blockquote>
<i>YYYY</i><b>-</b><i>MM</i><b>-</b><i>DD</i><b>T</b><i>HH</i><b>:</b><i>MM</i><b>:</b><i>SS</i>
</blockquote>
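
<p>
As an illustration, a D-line in this format could be produced as follows.
The helper name <i>d_line</i> is hypothetical, and the sketch assumes the
caller passes a timezone-aware datetime.
</p>

```python
from datetime import datetime, timezone


def d_line(moment):
    """Format a timezone-aware datetime as a D-line in UTC.

    Hypothetical helper.  A naive datetime would be interpreted as
    local time by astimezone(), so callers should pass an aware value.
    """
    utc = moment.astimezone(timezone.utc)
    return "D " + utc.strftime("%Y-%m-%dT%H:%M:%S")
```

<p>
For example, the time of this check-in, 2007-07-24 12:52:32 UTC, would be
written as "D 2007-07-24T12:52:32".
</p>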

<p>
A manifest has zero or more F-lines.  Each F-line defines a file
(other than the manifest itself) which is part of the baseline that
the manifest defines.  There are two arguments.  The first argument
is the pathname of the file in the baseline relative to the root
of the project file hierarchy.  No ".." or "." directories are allowed
within the filename.  Space characters are escaped as in C-line
comment text.  Backslash characters and newlines are not allowed
within filenames.  The directory separator character is a forward
slash (ASCII 0x2F).  The second argument to the F-line is the
full 40-character hexadecimal SHA1 hash of the file content.  
Upper-case letters ABCDEF are used for the higher digits of the
hexadecimal.
</p>

<p>







A manifest has zero or one P-lines.  Most manifests have one P-line.
The P-line has a varying number of arguments that
define other manifests from which the current manifest
is derived.  Each argument is a 40-character uppercase 
hexadecimal SHA1 of the predecessor manifest.  All arguments
to the P-line must be unique to that line.
The first predecessor is the manifest's direct ancestor.
Other arguments define manifests with which the first was
merged to yield the current manifest.  Most manifests have
a P-line with a single argument.  The first manifest in the
project has no ancestors and thus has no P-line.
</p>

<p>
A manifest may optionally have a single R-line.  The R-line has
a single argument which is the MD5 checksum of all files in 
the baseline except the manifest itself.  The checksum is expressed
as 32-characters of uppercase hexadecimal.   The checksum is
computed as follows:  For each file in the baseline (except for
the manifest itself) in strict sorted lexicographical order, 
take the pathname of the file relative to the root of the
repository, append a single space (ASCII 0x20), the
size of the file in ASCII decimal, a single newline
character (ASCII 0x0A), and the complete text of the file.
Compute the MD5 checksum of the result.
</p>
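
<p>
The procedure above can be sketched as follows.  This is one reading of
the description, not fossil's implementation: it assumes that a single
MD5 is taken over the concatenation of all per-file records, that file
size is counted in bytes, and that pathnames are sorted by their encoded
bytes.  The function name <i>repository_checksum</i> is hypothetical.
</p>

```python
import hashlib


def repository_checksum(files):
    """Compute an R-line style checksum.

    `files` maps each pathname (relative to the project root) to the
    file content as bytes.  The manifest itself must not be included.
    """
    md5 = hashlib.md5()
    # Assumption: "sorted lexicographical order" means byte-wise order.
    for path in sorted(files, key=lambda p: p.encode("utf-8")):
        content = files[path]
        md5.update(path.encode("utf-8"))
        md5.update(b" ")
        md5.update(str(len(content)).encode("ascii"))  # size in decimal
        md5.update(b"\n")
        md5.update(content)
    return md5.hexdigest().upper()  # 32 uppercase hexadecimal digits
```

<p>
Feeding the hash incrementally avoids holding the whole baseline in
memory at once and gives the same result as hashing the concatenation.
</p>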

<p>
Each manifest has a single U-line.  The argument to the U-line is
the login of the user who created the manifest.  The login name
is encoded using the same character escapes as are used for the
check-in comment argument to the C-line.
</p>

<p>
A manifest has an optional Z-line as its last line.  The argument
to the Z-line is a 32-character uppercase hexadecimal MD5 hash
of all prior lines of the manifest up to and including the newline 
character that immediately precedes the "Z".  The Z-line is just
a sanity check to prove that the manifest is well-formed and
consistent.
</p>
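
<p>
A minimal sketch of writing and checking a Z-line is shown below.  The
function names are hypothetical, and the sketch assumes that the Z-line
itself ends with a newline, which the text above does not state
explicitly.
</p>

```python
import hashlib


def append_z_line(body):
    """Return `body` with a Z-line appended.

    `body` holds all prior manifest lines and must end with a newline,
    because that newline is part of the hashed text.
    """
    if not body.endswith("\n"):
        raise ValueError("manifest body must end with a newline")
    digest = hashlib.md5(body.encode("utf-8")).hexdigest().upper()
    return body + "Z " + digest + "\n"


def check_z_line(manifest):
    """Return True if the final Z-line matches the preceding lines."""
    head, sep, tail = manifest.rpartition("\nZ ")
    if not sep:
        return False  # no Z-line found after a newline
    body = head + "\n"  # restore the newline that precedes the "Z"
    expected = hashlib.md5(body.encode("utf-8")).hexdigest().upper()
    return tail.rstrip("\n") == expected
```

<p>
Because the hash covers every prior line, changing any other line of the
manifest without recomputing the Z-line makes the check fail.
</p>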

<h2>2.0 Trouble Tickets</h2>

<p>
Each trouble ticket is a file in the repository and appears in
a manifest for every baseline in which the ticket exists.
Trouble tickets occur in a specific subdirectory of the file
hierarchy.  The name of the subdirectory that contains tickets
is part of the local state of each repository.  The filename
of each trouble ticket has a ".tkt" suffix.  The trouble ticket
has a particular file format defined below.
</p>

<i>To be continued...</i>

<h2>3.0 Wiki Pages</h2>

<p>
Each wiki page is a file in the repository and appears in
a manifest for every baseline in which that wiki page exists.
Wiki pages occur in a specific subdirectory of the file
hierarchy.  The name of the subdirectory that contains wiki pages
is part of the local state of each repository.  The filename
of each wiki page has a ".wiki" suffix.  The base name of
the file is the name of the wiki page.  The wiki pages
have a particular file format defined below.
</p>

<i>To be continued...</i>

Changes to www/index.html.

<body bgcolor="white">
<h1>Fossil - A Software Configuration Management System</h1>

<p>
This is a preliminary homepage for a new software configuration
management system called "Fossil".
The code is currently under development, and has been for about

a year.  Nothing is available for download or inspection 
as of this writing (2007-06-09).
But the system is self-hosting now.
Hopefully something will be available soon.
</p>

<p>Distinctive features of Fossil:</p>

<ul>
<li>Supports disconnected, distributed development (like
<a href="http://kerneltrap.org/node/4982">git</a>,
<a href="http://www.venge.net/monotone/">monotone</a>,
<a href="http://www.selenic.com/mercurial/wiki/index.cgi">mercurial</a>, or
<a href="http://www.bitkeeper.com/">bitkeeper</a>)
or tightly coupled client/server operation (like 
<a href="http://www.nongnu.org/cvs/">CVS</a> or
<a href="http://subversion.tigris.org/">subversion</a>)
or both at the same time</li>
<li>Integrated bug tracking and wiki, along the lines of
<a href="http://www.cvstrac.org/">CVSTrac</a> and
<a href="http://www.edgewall.com/trac/">Trac</a>.</li>
<li>Built-in web interface that supports deep archaeological digs through
................................................................................
from behind restrictive firewalls).</li>
<li>Everything included in a single self-contained executable -
    trivial to install</li>
<li>Server runs as <a href="http://www.w3.org/CGI/">CGI</a>, using
<a href="http://en.wikipedia.org/wiki/inetd">inetd</a> or
<a href="http://www.xinetd.org/">xinetd</a> or using its own built-in,
standalone web server.</li>
<li>The entire project contained in single disk file (which also
happens to be an <a href="http://www.sqlite.org/">SQLite</a> database.)</li>
<li>Self sign-up (at the administrators discretion) including the
ability to support secure anonymous check-ins (also optional).</li>
<li>Digital signatures on all files, versions, 
<a href="http://wiki.org/wiki.cgi?WhatIsWiki">wiki</a> pages,
trouble tickets, etc.  Everything is digitally signed.</li>
<li>Trivial to setup and administer</li>
<li>Files and versions identified by their
<a href="http://en.wikipedia.org/wiki/SHA-1">SHA-256</a> signature expressed
in <a href="base32.html">base-32 notation</a>.
Any unique prefix is sufficient to identify a file
or version - usually the first 4 or 5 characters suffice.</li>


<li>Automatic <a href="selfcheck.html">self-check</a>
on repository changes makes it exceedingly
unlikely that data will ever be lost because of a software bug.</li>
</ul>

<p>Goals of fossil:</p>

<ul>
<li>Fossil should be ridiculously easy to install and operate.</li>
<li>With fossil, it should be possible (and easy) to set up a project
on an inexpensive shared-hosting ISP
(example: <a href="http://www.he.net/hosting.html">Hurricane Electric</a>)
that provides nothing more than web space and CGI capability.</li>
................................................................................
"the wisdom of crowds" by opening up write access to the masses.</li>
</ul>

<p>Links:</p>

<ul>
<li><a href="pop.html">Principals Of Operation</a></li>
<li>The <a href="base32.html">base-32 encoding</a> mechanism used
by Fossil.</li>
<li>The <a href="fileformat.html">file format</a> used by every content
file stored in the repository.</li>
</ul>

</body>
</html>







<body bgcolor="white">
<h1>Fossil - A Software Configuration Management System</h1>

<p>
This is a preliminary homepage for a new software configuration
management system called "Fossil".
The code is currently under development, and has been for about
two years.  (We have iterated the design multiple times.)
Nothing is available for download or inspection 
as of this writing (2007-07-24).
But the system is self-hosting now.
Hopefully something will be available soon.
</p>

<p>Design Goals For Fossil:</p>

<ul>
<li>Supports disconnected, distributed development (like
<a href="http://kerneltrap.org/node/4982">git</a>,
<a href="http://www.venge.net/monotone/">monotone</a>,
<a href="http://www.selenic.com/mercurial/wiki/index.cgi">mercurial</a>, or
<a href="http://www.bitkeeper.com/">bitkeeper</a>)
or client/server operation (like 
<a href="http://www.nongnu.org/cvs/">CVS</a> or
<a href="http://subversion.tigris.org/">subversion</a>)
or both at the same time</li>
<li>Integrated bug tracking and wiki, along the lines of
<a href="http://www.cvstrac.org/">CVSTrac</a> and
<a href="http://www.edgewall.com/trac/">Trac</a>.</li>
<li>Built-in web interface that supports deep archaeological digs through
................................................................................
from behind restrictive firewalls).</li>
<li>Everything included in a single self-contained executable -
    trivial to install</li>
<li>Server runs as <a href="http://www.w3.org/CGI/">CGI</a>, using
<a href="http://en.wikipedia.org/wiki/inetd">inetd</a> or
<a href="http://www.xinetd.org/">xinetd</a> or using its own built-in,
standalone web server.</li>
<li>An entire project contained in single disk file (which also
happens to be an <a href="http://www.sqlite.org/">SQLite</a> database.)</li>





<li>Trivial to set up and administer</li>
<li>Files and versions identified by their
<a href="http://en.wikipedia.org/wiki/SHA-1">SHA1</a> signature.

Any unique prefix is sufficient to identify a file
or version - usually the first 4 or 5 characters suffice.</li>
<li>The file format is trivial and requires nothing more complex
than a text editor and the "sha1sum" command-line utility to decode.</li>
<li>Automatic <a href="selfcheck.html">self-check</a>
on repository changes makes it exceedingly
unlikely that data will ever be lost because of a software bug.</li>
</ul>

<p>Objectives Of Fossil:</p>

<ul>
<li>Fossil should be ridiculously easy to install and operate.</li>
<li>With fossil, it should be possible (and easy) to set up a project
on an inexpensive shared-hosting ISP
(example: <a href="http://www.he.net/hosting.html">Hurricane Electric</a>)
that provides nothing more than web space and CGI capability.</li>
................................................................................
"the wisdom of crowds" by opening up write access to the masses.</li>
</ul>

<p>Links:</p>

<ul>
<li><a href="pop.html">Principles Of Operation</a></li>
<li>The <a href="selfcheck.html">automatic self-check</a> mechanism
helps ensure project integrity.</li>
<li>The <a href="fileformat.html">file format</a> used by every content
file stored in the repository.</li>
</ul>

</body>
</html>

Changes to www/pop.html.

has the potential to be shared in common when the
repositories are fully synchronized).  The local state
for each repository is private to that repository.
The global state represents the content of the project.
The local state identifies the authorized users and
access policies for a particular repository.</p></li>

<li><p>The global state of a repository is a mostly unordered
collection of files.  Each file is named by 
its SHA256 hash.  The name is encoded as a 52-digit 
base-32 number.  In many contexts, the name can be
abbreviated to a unique prefix.  A five- or six-character
prefix usually suffices to uniquely identify a file.</p></li>

<li><p>Because files are named by their SHA256 hash, all files
are immutable.  Any change to the content of a file also 
changes the hash that forms the file's name, thus
creating a new file.  Both the old original version of the
file and the new change are preserved under different names.</p></li>

<li><p>It is theoretically possible for two files with different
content to share the same hash.  But finding two such
files is so incredibly difficult and unlikely that we
consider it to be an impossibility.</p></li>

<li><p>The files that comprise the global state of a repository
consist of a header followed by optional content.  Every
file contains an RSA signature in the header.  And every 
file contains a "file type" designator in the header.
Additional information is also found in the header depending
on the file type.</p></li>

<li><p>The files that comprise the global state of a repository
are the complete global state of that repository.  The SQLite
database that holds the repository contains additional information
about linkages between files, but all of that added information
can be discarded and reconstructed by scanning the content
files.</p></li>

<li><p>Two repositories for the same project can synchronize
their global states simply by sharing files.  The local
state of repositories is not normally synchronized or
shared.</p></li>

<li><p>The name of a file is its SHA256 hash in a base-32 
encoding.  The digits of the base-32 encode are as 
follows:


<blockquote><b>
    0123456789abcdefghjkmnpqrstuvwxy
</b></blockquote>


<p>The letters "o", "i", and "l" are omitted from the
encoding character set to avoid confusion with the 
digits "0" and "1".  On input, upper and lower case 
letters are treated the same, the letter "o" is 
interpreted as a zero ("0") and the letters "i" and 
"l" are interpreted as a one ("1").  The full name of 
a file is 52 characters long.  The first 4 bits of the
SHA256 hash are repeated onto the end of the hash so that
the last digit in the base-32 encoding will contain a
full 5 bits.
For convenience, files
may often be abbreviated to a unique prefix and the 
repository will automatically expand the name to
its full 52 characters.  In practice, 5 or 6
characters are usually sufficient to give a unique 
name prefix to files even in the largest of projects.</p></li>
</ul>

</body>
</html>







has the potential to be shared in common when the
repositories are fully synchronized).  The local state
for each repository is private to that repository.
The global state represents the content of the project.
The local state identifies the authorized users and
access policies for a particular repository.</p></li>

<li><p>The global state of a repository is an unordered
collection of files.  Each file is named by 
its SHA1 hash encoded in hexadecimal.
In many contexts, the name can be
abbreviated to a unique prefix.  A five- or six-character
prefix usually suffices to uniquely identify a file.</p></li>

<li><p>Because files are named by their SHA1 hash, all files
are immutable.  Any change to the content of a file also 
changes the hash that forms the file's name, thus
creating a new file.  Both the old original version of the
file and the new change are preserved under different names.</p></li>

<li><p>It is theoretically possible for two files with different
content to share the same hash.  But finding two such
files is so incredibly difficult and unlikely that we
consider it to be an impossibility.</p></li>

<li><p>The signature of a file is the SHA1 hash of the 
file itself, exactly as it appears on disk.  No prefix
or meta-information about the file is added before computing
the hash.  So you can
always find the SHA1 signature of a file by using the
"sha1sum" command-line utility.</p></li>

<li><p>The files that comprise the global state of a repository
are the complete global state of that repository.  The SQLite
database that holds the repository contains additional information
about linkages between files, but all of that added information
can be discarded and reconstructed by rescanning the content
files.</p></li>

<li><p>Two repositories for the same project can synchronize
their global states simply by sharing files.  The local
state of repositories is not normally synchronized or
shared.</p></li>

<li><p>Every repository has a special file at the top-level
named "manifest" which is an index of all other files in
the system.  The manifest is automatically created and
maintained by the system.</p></li>


<li><p>The <a href="fileformat.html">file format</a>
is very simple so that with access

to the original content files, one can easily reconstruct
the content of a baseline without the need for any
special tools or software.</p></li>
















</body>
</html>

Changes to www/selfcheck.html.

fossil uses to help prevent file loss due to bugs.
</p>

<h2>Atomic Check-ins With Rollback</h2>

<p>
The fossil repository is an
<a href="http://www.sqlite.org/">SQLite</a> database file.  SQLite
is very mature and stable and has been in wide-spread use for many
years, so we have little worries that it might cause repository
corruption.  SQLite
databases do not corrupt even if a program or system crash or power
failure occurs in the middle of the update.  If some kind of crash
does occur in the middle of a change, then all the changes are rolled
back the next time that the database is accessed.
</p>
................................................................................
To increase our confidence that everything in the repository is
recoverable, fossil makes sure it can extract an exact replicate
of every content file that it changes just prior to transaction
commit.  So during the course of check-in, many different files
in the repository might be modified.  Some files are simply
compressed.  Other files are delta encoded and then compressed.
While all this is going on, fossil makes a record of every file
that is encoded and the MD5 hash of the original content of that
file.  Then just before transaction commit, fossil re-extracts
the original content of all files that were written, computes
the MD5 checksum again, and verifies that the checksums match.
If anything does not match up, an error
message is printed and the transaction rolls back.
</p>

<p>
So, in other words, fossil always checks to make sure it can
re-extract a file before it commits a check-in of that file.
Hence bugs in fossil are unlikely to corrupt the repository in
a way that prevents us from extracting historical versions of 
files.
</p>

<h2>Checksums on all files and versions</h2>

<p>
Repository records of type "file" (records that hold the content
of project files) contain a "cksum" property which records the
MD5 checksum of the content of that file.  So if something goes
wrong in the file extraction process we will at least know about
it.  This checksum is in addition to the digital signature that
is over the entire header and content of the record.
</p>

<p>


Repository records of type "version" contain a "cksum"
property that holds the MD5 checksum of the concatenation of
every file in the entire project.  During a check-in, after
fossil has inserted all changes into the repository, it goes
back and rereads every file out of the repository and recomputes
this global checksum based on the repository content.  It then
computes an MD5 checksum over the files on disk.  If these two
checksums do not match, the check-in fails and rolls back.
Thus if a check-in transaction is successful, we have high
confidence that the content in the repository exactly matches
the content on disk.
</p>

<p>
Every project file is verified by three separate checksums.
There is an SHA256 checksum used as part of the digital signature
on the file.  There is an MD5 checksum on the content of each
individual file.  And there is a global MD5 checksum over the
entire project source tree.  If any of these cross-checks do not
match then the operation fails and an error is displayed.  Taken
together, these cross-checks give us high confidence that the
files you checked out are identical to the files you checked in.
</p>







fossil uses to help prevent file loss due to bugs.
</p>

<h2>Atomic Check-ins With Rollback</h2>

<p>
The fossil repository is an
<a href="http://www.sqlite.org/">SQLite version 3</a> database file.  
SQLite is very mature and stable and has been in widespread use for many
years, so we have little worry that it might cause repository
corruption.  SQLite
databases do not corrupt even if a program or system crash or power
failure occurs in the middle of the update.  If some kind of crash
does occur in the middle of a change, then all the changes are rolled
back the next time that the database is accessed.
</p>
................................................................................
To increase our confidence that everything in the repository is
recoverable, fossil makes sure it can extract an exact replica
of every content file that it changes just prior to transaction
commit.  So during the course of check-in, many different files
in the repository might be modified.  Some files are simply
compressed.  Other files are delta encoded and then compressed.
While all this is going on, fossil makes a record of every file
that is encoded and the SHA1 hash of the original content of that
file.  Then just before transaction commit, fossil re-extracts
the original content of all files that were written, computes
the SHA1 checksum again, and verifies that the checksums match.
If anything does not match up, an error
message is printed and the transaction rolls back.
</p>

<p>
So, in other words, fossil always checks to make sure it can
re-extract a file before it commits a check-in of that file.
Hence bugs in fossil are unlikely to corrupt the repository in
a way that prevents us from extracting historical versions of 
files.
</p>

<h2>Checksum Over All Files In A Baseline</h2>

<p>
Manifest files that define a baseline have two fields (the
R-line and Z-line) that record MD5 hashes of the manifest itself
and of all other files in the manifest.  Prior to any check-in
commit, these checksums are verified to ensure that the baseline
checked in agrees exactly with what is on disk.  Similarly,
the repository checksum is verified after a checkout to make
sure that the entire repository was checked out correctly.


Note that these added checks use a different hash (MD5 instead
of SHA1) in order to avoid common-mode failures in the hash
algorithm implementation.










</p>