Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Documentation updates - remove references to SHA1 as the only available hash algorithm. |
---|---|
Downloads: | Tarball | ZIP archive | SQL archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA1: |
796db898c7c5b2ccedf5bf3c36912ac5 |
User & Date: | drh 2017-03-02 00:10:41 |
Context
2017-03-02
| ||
01:28 | Fix harmless compiler warnings in the new hardened SHA1 implementation. check-in: ed30b3d6 user: drh tags: trunk | |
00:10 | Documentation updates - remove references to SHA1 as the only available hash algorithm. check-in: 796db898 user: drh tags: trunk | |
2017-03-01
| ||
22:26 | Fix a redundant #include in sha1hard.c. check-in: 0f5dc21e user: drh tags: trunk | |
Changes
Changes to www/branching.wiki.
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
...
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
|
<tr><td align="center"> <img src="branch01.gif" width=280 height=68><br> Figure 1 </td></tr></table> Each circle represents a check-in. For the sake of clarity, the check-ins are given small consecutive numbers. In a real system, of course, the check-in numbers would be 40-character SHA1 hashes since it is not possible to allocate collision-free sequential numbers in a distributed system. But as sequential numbers are easier to read, we will substitute them for the 40-character SHA1 hashes in this document. The arrows in figure 1 show the evolution of a project. The initial check-in is 1. Check-in 2 is derived from 1. In other words, check-in 2 was created by making edits to check-in 1 and then committing those edits. We say that 2 is a <i>child</i> of 1 and that 1 is a <i>parent</i> of 2. Check-in 3 is derived from check-in 2, making ................................................................................ The initial check-in of every repository has two propagating tags. In figure 5, that initial check-in is check-in 1. The <b>branch</b> tag tells (by its value) what branch the check-in is a member of. The default branch is called "trunk." All tags that begin with "<b>sym-</b>" are symbolic name tags. When a symbolic name tag is attached to a check-in, that allows you to refer to that check-in by its symbolic name rather than by its 40-character SHA1 hash name. When a symbolic name tag propagates (as does the <b>sym-trunk</b> tag) then referring to that name is the same as referring to the most recent check-in with that name. Thus the two tags on check-in 1 cause all descendants to be in the "trunk" branch and to have the symbolic name "trunk." Check-in 4 has a <b>branch</b> tag which changes the name of the branch to "test." The branch tag on check-in 4 propagates to check-ins 6 and 9. |
|
|
|
|
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
...
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
|
<tr><td align="center"> <img src="branch01.gif" width=280 height=68><br> Figure 1 </td></tr></table> Each circle represents a check-in. For the sake of clarity, the check-ins are given small consecutive numbers. In a real system, of course, the check-in numbers would be long hexadecimal hashes since it is not possible to allocate collision-free sequential numbers in a distributed system. But as sequential numbers are easier to read, we will substitute them for the long hashes in this document. The arrows in figure 1 show the evolution of a project. The initial check-in is 1. Check-in 2 is derived from 1. In other words, check-in 2 was created by making edits to check-in 1 and then committing those edits. We say that 2 is a <i>child</i> of 1 and that 1 is a <i>parent</i> of 2. Check-in 3 is derived from check-in 2, making ................................................................................ The initial check-in of every repository has two propagating tags. In figure 5, that initial check-in is check-in 1. The <b>branch</b> tag tells (by its value) what branch the check-in is a member of. The default branch is called "trunk." All tags that begin with "<b>sym-</b>" are symbolic name tags. When a symbolic name tag is attached to a check-in, that allows you to refer to that check-in by its symbolic name rather than by its hexadecimal hash name. When a symbolic name tag propagates (as does the <b>sym-trunk</b> tag) then referring to that name is the same as referring to the most recent check-in with that name. Thus the two tags on check-in 1 cause all descendants to be in the "trunk" branch and to have the symbolic name "trunk." Check-in 4 has a <b>branch</b> tag which changes the name of the branch to "test." The branch tag on check-in 4 propagates to check-ins 6 and 9. |
Changes to www/checkin_names.wiki.
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
..
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
|
<table align="right" border="1" width="33%" cellpadding="10"> <tr><td> <h3>Executive Summary</h3> <p>A check-in can be identified using any of the following names: <ul> <li> SHA1 hash prefix <li> Tag or branchname <li> Timestamp: <i>YYYY-MM-DD HH:MM:SS</i> <li> <i>tag-name</i> <big><b>:</b></big> <i>timestamp</i> <li> <b>root :</b> <i>branchname</i> <li> Special names: <ul> <li> <b>tip</b> ................................................................................ determines which version of the documentation to display. Fossil provides a variety of ways to specify a check-in. This document describes the various methods. <h2>Canonical Check-in Name</h2> The canonical name of a check-in is the SHA1 hash of its [./fileformat.wiki#manifest | manifest] expressed as a 40-character lowercase hexadecimal number. For example: <blockquote><pre> fossil info e5a734a19a9826973e1d073b49dc2a16aa2308f9 </pre></blockquote> The full 40-character SHA1 hash is unwieldy to remember and type, though, so Fossil also accepts a unique prefix of the hash, using any combination of upper and lower case letters, as long as the prefix is at least 4 characters long. Hence the following commands all accomplish the same thing as the above: <blockquote><pre> fossil info e5a734a19a9 |
|
|
|
|
|
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
..
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
|
<table align="right" border="1" width="33%" cellpadding="10"> <tr><td> <h3>Executive Summary</h3> <p>A check-in can be identified using any of the following names: <ul> <li> Cryptographic hash prefix <li> Tag or branchname <li> Timestamp: <i>YYYY-MM-DD HH:MM:SS</i> <li> <i>tag-name</i> <big><b>:</b></big> <i>timestamp</i> <li> <b>root :</b> <i>branchname</i> <li> Special names: <ul> <li> <b>tip</b> ................................................................................ determines which version of the documentation to display. Fossil provides a variety of ways to specify a check-in. This document describes the various methods. <h2>Canonical Check-in Name</h2> The canonical name of a check-in is the hash of its [./fileformat.wiki#manifest | manifest] expressed as a 40-or-more character lowercase hexadecimal number. For example: <blockquote><pre> fossil info e5a734a19a9826973e1d073b49dc2a16aa2308f9 </pre></blockquote> The full 40+ character hash is unwieldy to remember and type, though, so Fossil also accepts a unique prefix of the hash, using any combination of upper and lower case letters, as long as the prefix is at least 4 characters long. Hence the following commands all accomplish the same thing as the above: <blockquote><pre> fossil info e5a734a19a9 |
Changes to www/customskin.md.
175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 |
without the leading "/" and without query parameters.
Examples: "timeline", "doc/trunk/README.txt", "wiki".
* **csrf_token** - A token used to prevent cross-site request forgery.
* **release_version** - The release version of Fossil. Ex: "1.31"
* **manifest_version** - A prefix on the SHA1 check-in hash of the
specific version of fossil that is running. Ex: "\[47bb6432a1\]"
* **manifest_date** - The date of the source-code check-in for the
version of fossil that is running.
* **compiler_name** - The name and version of the compiler used to
build the fossil executable.
|
| |
175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 |
without the leading "/" and without query parameters. Examples: "timeline", "doc/trunk/README.txt", "wiki". * **csrf_token** - A token used to prevent cross-site request forgery. * **release_version** - The release version of Fossil. Ex: "1.31" * **manifest_version** - A prefix on the check-in hash of the specific version of fossil that is running. Ex: "\[47bb6432a1\]" * **manifest_date** - The date of the source-code check-in for the version of fossil that is running. * **compiler_name** - The name and version of the compiler used to build the fossil executable. |
Changes to www/fileformat.wiki.
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
..
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
|
<h2>1.0 Artifact Names</h2> Each artifact in the repository is named by a hash of its content. No prefixes, suffixes, or other information is added to an artifact before the hash is computed. The artifact name is just the (lower-case hexadecimal) hash of the raw artifact. Fossil supports multiple hash algorithms including SHA1 and various lengths of SHA3. Because an artifact can be hashed using multiple algorithms, a single artifact can have multiple names. Usually, Fossil knows each artifact by just a single name called the "display name". But it is possible for Fossil to know an artifact by multiple names from different hashes. In that case, Fossil uses the display name for output, but continues to accept the alternative names as command-line arguments or as parameters to webpage URLs. When referring to artifacts in using tty commands or webpage URLs, it is sufficient to specify a unique prefix for the artifact name. If the input prefix is not unique, Fossil will show an error. Within a structural artifact, however, all references to other artifacts must be the complete hash. ................................................................................ Prior to Fossil version 2.0, all names were formed from the SHA1 hash of the artifact. The key innovation in Fossil 2.0 was adding support for alternative hash algorithms. <a name="structural"></a> <h2>2.0 Structural Artifacts</h2> A structural artifact is an artifact that has a particular format and that is used to define the relationships between other artifacts in the repository. Fossil recognizes the following kinds of structural artifacts: <ul> <li> [#manifest | Manifests] </li> |
|
|
|
<
<
<
<
<
|
|
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
..
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
|
<h2>1.0 Artifact Names</h2> Each artifact in the repository is named by a hash of its content. No prefixes, suffixes, or other information is added to an artifact before the hash is computed. The artifact name is just the (lower-case hexadecimal) hash of the raw artifact. Fossil currently computes artifact names using either SHA1 or SHA3-256. It is relatively easy to add new algorithms in the future, but there are no plans to do so at this time. When referring to artifacts in using tty commands or webpage URLs, it is sufficient to specify a unique prefix for the artifact name. If the input prefix is not unique, Fossil will show an error. Within a structural artifact, however, all references to other artifacts must be the complete hash. ................................................................................ Prior to Fossil version 2.0, all names were formed from the SHA1 hash of the artifact. The key innovation in Fossil 2.0 was adding support for alternative hash algorithms. <a name="structural"></a> <h2>2.0 Structural Artifacts</h2> A structural artifact is an artifact with a particular format that is used to define the relationships between other artifacts in the repository. Fossil recognizes the following kinds of structural artifacts: <ul> <li> [#manifest | Manifests] </li> |
Changes to www/fossil-v-git.wiki.
65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 |
website using Fossil can be done in minutes, whereas doing the same using
Git requires hours or days.
<h3>3.2 Database</h3>
The baseline data structures for Fossil and Git are the same (modulo
formatting details). Both systems store check-ins as immutable
objects referencing their immediate ancestors and named by their SHA1 hash.
The difference is that Git stores its objects as individual files
in the ".git" folder or compressed into
bespoke "pack-files", whereas Fossil stores its objects in a
relational ([https://www.sqlite.org/|SQLite]) database file. To put it
another way, Git uses an ad-hoc pile-of-files key/value database whereas
Fossil uses a proven, general-purpose SQL database. This
|
| > |
65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 |
website using Fossil can be done in minutes, whereas doing the same using Git requires hours or days. <h3>3.2 Database</h3> The baseline data structures for Fossil and Git are the same (modulo formatting details). Both systems store check-ins as immutable objects referencing their immediate ancestors and named by a cryptographic hash of the check-in content. The difference is that Git stores its objects as individual files in the ".git" folder or compressed into bespoke "pack-files", whereas Fossil stores its objects in a relational ([https://www.sqlite.org/|SQLite]) database file. To put it another way, Git uses an ad-hoc pile-of-files key/value database whereas Fossil uses a proven, general-purpose SQL database. This |
Changes to www/pop.wiki.
23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 |
repositories are fully synchronized). The local state for each repository is private to that repository. The global state represents the content of the project. The local state identifies the authorized users and access policies for a particular repository.</p></li> <li><p>The global state of a repository is an unordered collection of artifacts. Each artifact is named by its SHA1 hash encoded in lowercase hexadecimal. In many contexts, the name can be abbreviated to a unique prefix. A five- or six-character prefix usually suffices to uniquely identify a file.</p></li> <li><p>Because artifacts are named by their SHA1 hash, all artifacts are immutable. Any change to the content of an artifact also changes the hash that forms the artifacts name, thus creating a new artifact. Both the old original version of the artifact and the new change are preserved under different names.</p></li> <li><p>It is theoretically possible for two artifacts with different content to share the same hash. But finding two such artifacts is so incredibly difficult and unlikely that we consider it to be an impossibility.</p></li> <li><p>The signature of an artifact is the SHA1 hash of the artifact itself, exactly as it would appear in a disk file. No prefix or meta-information about the artifact is added before computing the hash. So you can always find the SHA1 signature of a file by using the "sha1sum" command-line utility.</p></li> <li><p>The artifacts that comprise the global state of a repository are the complete global state of that repository. The SQLite database that holds the repository contains additional information about linkages between artifacts, but all of that added information can be discarded and reconstructed by rescanning the content artifacts.</p></li> |
| > | | | | | |
23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 |
repositories are fully synchronized). The local state for each repository is private to that repository. The global state represents the content of the project. The local state identifies the authorized users and access policies for a particular repository.</p></li> <li><p>The global state of a repository is an unordered collection of artifacts. Each artifact is named by a cryptographic hash (SHA1 or SHA3-256) encoded in lowercase hexadecimal. In many contexts, the name can be abbreviated to a unique prefix. A five- or six-character prefix usually suffices to uniquely identify a file.</p></li> <li><p>Because artifacts are named by a cryptographic hash, all artifacts are immutable. Any change to the content of an artifact also changes the hash that forms the artifacts name, thus creating a new artifact. Both the old original version of the artifact and the new change are preserved under different names.</p></li> <li><p>It is theoretically possible for two artifacts with different content to share the same hash. But finding two such artifacts is so incredibly difficult and unlikely that we consider it to be an impossibility.</p></li> <li><p>The signature of an artifact is the cryptographic hash of the artifact itself, exactly as it would appear in a disk file. No prefix or meta-information about the artifact is added before computing the hash. So you can always find the signature of a file by using the "sha1sum" or "sha3sum" or similar command-line utilities.</p></li> <li><p>The artifacts that comprise the global state of a repository are the complete global state of that repository. The SQLite database that holds the repository contains additional information about linkages between artifacts, but all of that added information can be discarded and reconstructed by rescanning the content artifacts.</p></li> |
Changes to www/selfcheck.wiki.
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
..
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
|
recoverable, fossil makes sure it can extract an exact replica of every content file that it changes just prior to transaction commit. So during the course of check-in (or other repository operation) many different files in the repository might be modified. Some files are simply compressed. Other files are delta encoded and then compressed. While all this is going on, fossil makes a record of every file that is encoded and the SHA1 hash of the original content of that file. Then just before transaction commit, fossil re-extracts the original content of all files that were written, computes the SHA1 checksum again, and verifies that the checksums match. If anything does not match up, an error message is printed and the transaction rolls back. So, in other words, fossil always checks to make sure it can re-extract a file before it commits a change to that file. Hence bugs in fossil are unlikely to corrupt the repository in a way that prevents us from extracting historical versions of ................................................................................ Manifest artifacts that define a check-in have two fields (the R-card and Z-card) that record MD5 hashes of the manifest itself and of all other files in the manifest. Prior to any check-in commit, these checksums are verified to ensure that the check-in agrees exactly with what is on disk. Similarly, the repository checksum is verified after a checkout to make sure that the entire repository was checked out correctly. Note that these added checks use a different hash (MD5 instead of SHA1) in order to avoid common-mode failures in the hash algorithm implementation. <h2>Checksums On Control Artifacts And Deltas</h2> Every [./fileformat.wiki | control artifact] in a fossil repository contains a "Z-card" bearing an MD5 checksum over the rest of the |
|
|
|
|
|
|
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
..
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
|
recoverable, fossil makes sure it can extract an exact replica of every content file that it changes just prior to transaction commit. So during the course of check-in (or other repository operation) many different files in the repository might be modified. Some files are simply compressed. Other files are delta encoded and then compressed. While all this is going on, fossil makes a record of every file and the SHA1 or SHA3-256 hash of the original content of that file. Then just before transaction commit, fossil re-extracts the original content of all files that were written, recomputes the hash, and verifies that the recomputed hash still matches. If anything does not match up, an error message is printed and the transaction rolls back. So, in other words, fossil always checks to make sure it can re-extract a file before it commits a change to that file. Hence bugs in fossil are unlikely to corrupt the repository in a way that prevents us from extracting historical versions of ................................................................................ Manifest artifacts that define a check-in have two fields (the R-card and Z-card) that record MD5 hashes of the manifest itself and of all other files in the manifest. Prior to any check-in commit, these checksums are verified to ensure that the check-in agrees exactly with what is on disk. Similarly, the repository checksum is verified after a checkout to make sure that the entire repository was checked out correctly. Note that these added checks use a different hash algorithm (MD5) in order to avoid common-mode failures in the hash algorithm implementation. <h2>Checksums On Control Artifacts And Deltas</h2> Every [./fileformat.wiki | control artifact] in a fossil repository contains a "Z-card" bearing an MD5 checksum over the rest of the |
Changes to www/shunning.wiki.
21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
disrupting the operation of Fossil.
<h2>Shunning</h2>
Fossil provides a mechanism called "shunning" for removing content from
a repository.
Every Fossil repository maintains a list of the SHA1 hash names of
"shunned" artifacts.
Fossil will refuse to push or pull any shunned artifact.
Furthermore, all shunned artifacts (but not the shunning list
itself) are removed from the
repository whenever the repository is reconstructed using the
"rebuild" command.
|
| |
21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
disrupting the operation of Fossil. <h2>Shunning</h2> Fossil provides a mechanism called "shunning" for removing content from a repository. Every Fossil repository maintains a list of the hash names of "shunned" artifacts. Fossil will refuse to push or pull any shunned artifact. Furthermore, all shunned artifacts (but not the shunning list itself) are removed from the repository whenever the repository is reconstructed using the "rebuild" command. |
Changes to www/sync.wiki.
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 ... 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 ... 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 ... 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 ... 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 |
<p>This document describes the wire protocol used to synchronize content between two Fossil repositories.</p> <h2>1.0 Overview</h2> <p>The global state of a fossil repository consists of an unordered collection of artifacts. Each artifact is identified by its SHA1 hash expressed as a 40-character lower-case hexadecimal string. Synchronization is the process of sharing artifacts between servers so that all servers have copies of all artifacts. Because artifacts are unordered, the order in which artifacts are received at a server is inconsequential. It is assumed that the SHA1 hashes of artifacts are unique - that every artifact has a different SHA1 hash. To a first approximation, synchronization proceeds by sharing lists SHA1 hashes of available artifacts, then sharing those artifacts that are not found on one side or the other of the connection. In practice, a repository might contain millions of artifacts. The list of SHA1 hashes for this many artifacts can be large. So optimizations are employed that usually reduce the number of SHA1 hashes that need to be shared to a few hundred.</p> <p>Each repository also has local state. The local state determines the web-page formatting preferences, authorized users, ticket formats, and similar information that varies from one repository to another. The local state is not using transferred during a sync. Except, some local state is transferred during a [/help?cmd=clone|clone] ................................................................................ or the artifact delta is the first <i>size</i> bytes of the x-fossil content that immediately follow the newline that terminates the file card. </p> <p>The first argument of a file card is the ID of the artifact that is being transferred. The artifact ID is the lower-case hexadecimal representation of the SHA1 hash of the artifact. The last argument of the file card is the number of bytes of payload that immediately follow the file card. If the file card has only two arguments, that means the payload is the complete content of the artifact. If the file card has three arguments, then the payload is a delta and second argument is the ID of another artifact that is the source of the delta.</p> ................................................................................ <blockquote> <b>cfile</b> <i>artifact-id usize csize</i> <b>\n</b> <i>content</i><br> <b>cfile</b> <i>artifact-id delta-artifact-id usize csize</i> <b>\n</b> <i>content</i><br> </blockquote> <p>The first argument of the cfile card is the ID of the artifact that is being transferred. The artifact ID is the lower-case hexadecimal representation of the SHA1 hash of the artifact. The second argument of the cfile card is the original size in bytes of the artifact. The last argument of the cfile card is the number of compressed bytes of payload that immediately follow the cfile card. If the cfile card has only three arguments, that means the payload is the complete content of the artifact. If the cfile card has four arguments, then the payload is a delta and the second argument is the ID of another artifact that is the source of the delta and the third argument is the original size of the ................................................................................ <blockquote> <b>uvfile</b> <i>name mtime hash size flags</i> <b>\n</b> <i>content</i> </blockquote> <p>The <i>name</i> field is the name of the unversioned file. The <i>mtime</i> is the last modification time of the file in seconds since 1970. The <i>hash</i> field is the SHA1 hash of the content for the unversioned file, or "<b>-</b>" for deleted content. The <i>size</i> field is the (uncompressed) size of the content in bytes. The <i>flags</i> field is an integer which is interpreted as an array of bits. The 0x0004 bit of <i>flags</i> indicates that the <i>content</i> is to be omitted. The content might be omitted if it is too large to transmit, or if the sender merely wants to update the modification time of the file without changing the files content. ................................................................................ <blockquote> <b>uvigot</b> <i>name mtime hash size</i> </blockquote> <p>The <i>name</i> argument is the name of an unversioned file. The <i>mtime</i> is the last modification time of the unversioned file in seconds since 1970. The <i>hash</i> is the SHA1 hash of the unversioned file content, or "<b>-</b>" if the file has been deleted. The <i>size</i> is the uncompressed size of the file in bytes. <p>When the server sees a "pragma uv-hash" card for which the hash does not match, it sends uvigot cards for every unversioned file that it holds. The client will use this information to figure out which unversioned files need to be synchronized. The server might also send a uvigot card when it receives a uvgimme card |
| | | | | | | | | | | | | | |
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 ... 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 ... 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 ... 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 ... 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 |
<p>This document describes the wire protocol used to synchronize content between two Fossil repositories.</p> <h2>1.0 Overview</h2> <p>The global state of a fossil repository consists of an unordered collection of artifacts. Each artifact is identified by a cryptographic hash of its content, expressed as a lower-case hexadecimal string. Synchronization is the process of sharing artifacts between servers so that all servers have copies of all artifacts. Because artifacts are unordered, the order in which artifacts are received at a server is inconsequential. It is assumed that the hash names of artifacts are unique - that every artifact has a different hash. To a first approximation, synchronization proceeds by sharing lists hash values for available artifacts, then sharing the content of artifacts whose names are missing from one side or the other of the connection. In practice, a repository might contain millions of artifacts. The list of hash names for this many artifacts can be large. So optimizations are employed that usually reduce the number of hashes that need to be shared to a few hundred.</p> <p>Each repository also has local state. The local state determines the web-page formatting preferences, authorized users, ticket formats, and similar information that varies from one repository to another. The local state is not using transferred during a sync. Except, some local state is transferred during a [/help?cmd=clone|clone] ................................................................................ or the artifact delta is the first <i>size</i> bytes of the x-fossil content that immediately follow the newline that terminates the file card. </p> <p>The first argument of a file card is the ID of the artifact that is being transferred. The artifact ID is the lower-case hexadecimal representation of the name hash for the artifact. The last argument of the file card is the number of bytes of payload that immediately follow the file card. If the file card has only two arguments, that means the payload is the complete content of the artifact. If the file card has three arguments, then the payload is a delta and second argument is the ID of another artifact that is the source of the delta.</p> ................................................................................ <blockquote> <b>cfile</b> <i>artifact-id usize csize</i> <b>\n</b> <i>content</i><br> <b>cfile</b> <i>artifact-id delta-artifact-id usize csize</i> <b>\n</b> <i>content</i><br> </blockquote> <p>The first argument of the cfile card is the ID of the artifact that is being transferred. The artifact ID is the lower-case hexadecimal representation of the name hash for the artifact. The second argument of the cfile card is the original size in bytes of the artifact. The last argument of the cfile card is the number of compressed bytes of payload that immediately follow the cfile card. If the cfile card has only three arguments, that means the payload is the complete content of the artifact. If the cfile card has four arguments, then the payload is a delta and the second argument is the ID of another artifact that is the source of the delta and the third argument is the original size of the ................................................................................ <blockquote> <b>uvfile</b> <i>name mtime hash size flags</i> <b>\n</b> <i>content</i> </blockquote> <p>The <i>name</i> field is the name of the unversioned file. The <i>mtime</i> is the last modification time of the file in seconds since 1970. The <i>hash</i> field is the hash of the content for the unversioned file, or "<b>-</b>" for deleted content. The <i>size</i> field is the (uncompressed) size of the content in bytes. The <i>flags</i> field is an integer which is interpreted as an array of bits. The 0x0004 bit of <i>flags</i> indicates that the <i>content</i> is to be omitted. The content might be omitted if it is too large to transmit, or if the sender merely wants to update the modification time of the file without changing the files content. ................................................................................ <blockquote> <b>uvigot</b> <i>name mtime hash size</i> </blockquote> <p>The <i>name</i> argument is the name of an unversioned file. The <i>mtime</i> is the last modification time of the unversioned file in seconds since 1970. The <i>hash</i> is the SHA1 or SHA3-256 hash of the unversioned file content, or "<b>-</b>" if the file has been deleted. The <i>size</i> is the uncompressed size of the file in bytes. <p>When the server sees a "pragma uv-hash" card for which the hash does not match, it sends uvigot cards for every unversioned file that it holds. The client will use this information to figure out which unversioned files need to be synchronized. The server might also send a uvigot card when it receives a uvgimme card |
Changes to www/tech_overview.wiki.
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
...
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
|
All of the original uncompressed and undeltaed artifacts can be extracted from a Fossil repository database using the [/help/deconstruct | fossil deconstruct] command. Individual artifacts can be extracted using the [/help/artifact | fossil artifact] command. When accessing the repository database using raw SQL and the [/help/sqlite3 | fossil sql] command, the extension function "<tt>content()</tt>" with a single argument which is the SHA1 hash of an artifact will return the complete undeleted and uncompressed content of that artifact. Going the other way, the [/help/reconstruct | fossil reconstruct] command will scan a directory hierarchy and add all files found to a new repository database. The [/help/import | fossil import] command works by reading the input git-fast-export stream and using it to construct ................................................................................ The set of canonical artifacts for a project - the global state for the project - is intended to be an append-only database. In other words, new artifacts can be added but artifacts can never be removed. But it sometimes happens that inappropriate content is mistakenly or maliciously added to a repository. The only way to get rid of the undesired content is to [./shunning.wiki | "shun"] it. The "shun" table in the repository database records the SHA1 hash of all shunned artifacts. The shun table can be pushed or pulled using the [/help/config | fossil config] command with the "shun" AREA argument. The shun table is also copied during a [/help/clone | clone]. <a name='localdb'></a> |
|
>
|
|
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
...
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
|
All of the original uncompressed and undeltaed artifacts can be extracted from a Fossil repository database using the [/help/deconstruct | fossil deconstruct] command. Individual artifacts can be extracted using the [/help/artifact | fossil artifact] command. When accessing the repository database using raw SQL and the [/help/sqlite3 | fossil sql] command, the extension function "<tt>content()</tt>" with a single argument which is the SHA1 or SHA3-256 hash of an artifact will return the complete undeleted and uncompressed content of that artifact. Going the other way, the [/help/reconstruct | fossil reconstruct] command will scan a directory hierarchy and add all files found to a new repository database. The [/help/import | fossil import] command works by reading the input git-fast-export stream and using it to construct ................................................................................ The set of canonical artifacts for a project - the global state for the project - is intended to be an append-only database. In other words, new artifacts can be added but artifacts can never be removed. But it sometimes happens that inappropriate content is mistakenly or maliciously added to a repository. The only way to get rid of the undesired content is to [./shunning.wiki | "shun"] it. The "shun" table in the repository database records the hash values for all shunned artifacts. The shun table can be pushed or pulled using the [/help/config | fossil config] command with the "shun" AREA argument. The shun table is also copied during a [/help/clone | clone]. <a name='localdb'></a> |
Changes to www/theory1.wiki.
36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 |
An artifact is a list of bytes - a "file" in the usual manner of thinking.
Many artifacts are simply the content of source files that have
been checked into the Fossil repository. Call these "content artifacts".
Other artifacts, known as
"control artifacts", contain ASCII text in a particular format that
defines relationships between other artifacts, such as which
content artifacts that go together to form a particular version of the
project. Each artifact is named by its SHA1 hash and is thus immutable.
Artifacts can be added to the database but not removed (if we ignore
the exceptional case of [./shunning.wiki | shunning].) Repositories
synchronize by computing the union of their artifact sets. SQL and
relation theory play no role in any of this.
SQL enters the picture only in the implementation details. The current
implementation of Fossil stores each artifact as a BLOB in an SQLite
|
| > |
36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 |
An artifact is a list of bytes - a "file" in the usual manner of thinking. Many artifacts are simply the content of source files that have been checked into the Fossil repository. Call these "content artifacts". Other artifacts, known as "control artifacts", contain ASCII text in a particular format that defines relationships between other artifacts, such as which content artifacts that go together to form a particular version of the project. Each artifact is named by its SHA1 or SHA3-256 hash and is thus immutable. Artifacts can be added to the database but not removed (if we ignore the exceptional case of [./shunning.wiki | shunning].) Repositories synchronize by computing the union of their artifact sets. SQL and relation theory play no role in any of this. SQL enters the picture only in the implementation details. The current implementation of Fossil stores each artifact as a BLOB in an SQLite |
Changes to www/webpage-ex.md.
121 122 123 124 125 126 127 128 129 130 131 132 |
* <a target='_blank' class='exbtn'
href='$ROOT/bigbloblist'>Example</a>
The largest objects in the repository.
* <a target='_blank' class='exbtn'
href='$ROOT/hash-collisions'>Example</a>
SHA1 prefix collisions
* <a target='_blank' class='exbtn'
href='$ROOT/sitemap'>Example</a>
The "sitemap" containing links to many other pages
|
| |
121 122 123 124 125 126 127 128 129 130 131 132 |
* <a target='_blank' class='exbtn'
href='$ROOT/bigbloblist'>Example</a>
The largest objects in the repository.
* <a target='_blank' class='exbtn'
href='$ROOT/hash-collisions'>Example</a>
Hash prefix collisions
* <a target='_blank' class='exbtn'
href='$ROOT/sitemap'>Example</a>
The "sitemap" containing links to many other pages
|
Changes to www/whyusefossil.wiki.
230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 |
to the first check-in of a branch. The name assigned by this special tag automatically propagates to all direct children. </ul> </ul> <li><p><b>Why version control is important (reprise)</b> <ol type="A"> <li><p>Every check-in and every individual file has a unique name - its SHA1 hash. Team members can unambiguously identify any specific version of the overall project or any specific version of an individual file. <li><p>Any historical version of the whole project or of any individual file can be easily recreated at any time and by any team member. <li><p>Accidental changes to files can be detected by recomputing their SHA1 hash. <li><p>Files of unknown origin can be identified using their SHA1 hash. <li><p>Developers are able to work in parallel, review each others work, and easily merge their changes together. External revisions to the baseline can be easily incorporated into the latest changes. <li><p>Developers can follow experimental lines of development, then revert back to an earlier stable version if the experiment does not work out. Creativity is enhanced by allowing crazy ideas to be investigated without destabilizing the project. |
| > | | |
230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 |
to the first check-in of a branch. The name assigned by this special tag automatically propagates to all direct children. </ul> </ul> <li><p><b>Why version control is important (reprise)</b> <ol type="A"> <li><p>Every check-in and every individual file has a unique name - its SHA1 or SHA3-256 hash. Team members can unambiguously identify any specific version of the overall project or any specific version of an individual file. <li><p>Any historical version of the whole project or of any individual file can be easily recreated at any time and by any team member. <li><p>Accidental changes to files can be detected by recomputing their cryptographic hash. <li><p>Files of unknown origin can be identified using their hash. <li><p>Developers are able to work in parallel, review each others work, and easily merge their changes together. External revisions to the baseline can be easily incorporated into the latest changes. <li><p>Developers can follow experimental lines of development, then revert back to an earlier stable version if the experiment does not work out. Creativity is enhanced by allowing crazy ideas to be investigated without destabilizing the project. |