Fossil

Check-in [f46458d5]

Overview
Comment: Reworked the basic structure of pass InitCSets to keep memory consumption down. It now incrementally creates, breaks, saves, and releases changesets, instead of piling them up before saving them all at the end. Memory tracking confirms that this turns the accumulating mountain into near-constant usage, with the expected spikes from the breaking.
SHA1: f46458d5bdc221859c6b009de9b8e5dcbbc2534a
User & Date: aku 2008-02-17 02:06:19
References
2008-02-23 06:33
Fixed bug made in [f46458d5bd] which prevented the saving of the changesets generated by the breaking of the internal dependencies. check-in: b3d61d78 user: aku tags: trunk
Context
2008-02-23 06:33
Fixed bug made in [f46458d5bd] which prevented the saving of the changesets generated by the breaking of the internal dependencies. check-in: b3d61d78 user: aku tags: trunk
2008-02-21 05:13
Added high-level logging for memory tracing to the code breaking the preliminary changesets. First runs indicate that the DEPC array becomes very large, caused by a high number of indirect dependencies (several hundred). check-in: c2ad73ed user: aku tags: trunk
2008-02-20 06:03
Modified the changeset class to move handling of the changeset lists to fully after their creation and storage. This is item (3) in cvsfossil.txt. The results are not satisfactory, however. During the creation of each changeset memory usage is (fractionally) lower, but at the end, after all changesets have been loaded, memory usage is consistently higher. The reason for that is not known. I am saving this for possible future evolution and usage, but will not pursue this further right now. The gains seem to be too small compared to the overall loss. InitializeBreakstate is likely a better target, despite its complexity. check-in: faf57d74 user: aku tags: trunk
2008-02-17 02:06
Reworked the basic structure of pass InitCSets to keep memory consumption down. It now incrementally creates, breaks, saves, and releases changesets, instead of piling them up before saving them all at the end. Memory tracking confirms that this turns the accumulating mountain into near-constant usage, with the expected spikes from the breaking. check-in: f46458d5 user: aku tags: trunk
2008-02-16 06:46
Extended pass InitCsets and underlying code with more log output geared towards memory introspection, and added markers for special locations. Extended my notes with general observations from the first test runs over my example CVS repositories. check-in: 27ed4f7d user: aku tags: trunk
Changes

Changes to tools/cvs2fossil/getmemoryseries.tcl.

Old version:

#!/bin/bash
# -*- tcl -*- \
exec tclsh "$0" ${1+"$@"}

package require csv
foreach {in outbasic outmarker plot} $argv break

set in [open $in        r]
set ba [open $outbasic  w]
set mr [open $outmarker w]

puts $ba "\# Time Memory MaxMemory"
puts $mr "\# Time Memory"
................................................................................
set    f [open $plot w]
puts  $f ""
puts  $f "plot \"$outbasic\" using 1:2 title 'Memory'     with steps, \\"
puts  $f "     \"$outbasic\" using 1:3 title 'Max Memory' with steps"
puts  $f "pause -1"
puts  $f ""
close $f
exit

# Comparison to baseline
plot "basic.dat"     using 1:2 title 'Memory Base'    with steps lt rgb "blue", \
     "newbasic.dat"  using 1:2 title 'Memory Current' with steps lt rgb "red", \

# Comparison to baseline via normalization - need math op (div)
plot "basic.dat"     using 1:2 title 'Memory Base'    with steps lt rgb "blue", \
     "newbasic.dat"  using 1:2 title 'Memory Current' with steps lt rgb "red", \

New version:

#!/bin/bash
# -*- tcl -*- \
exec tclsh "$0" ${1+"$@"}

package require csv
foreach {in outbasic outmarker plot outbasicold} $argv break

set in [open $in        r]
set ba [open $outbasic  w]
set mr [open $outmarker w]

puts $ba "\# Time Memory MaxMemory"
puts $mr "\# Time Memory"
................................................................................
set    f [open $plot w]
puts  $f ""
puts  $f "plot \"$outbasic\" using 1:2 title 'Memory'     with steps, \\"
puts  $f "     \"$outbasic\" using 1:3 title 'Max Memory' with steps"
puts  $f "pause -1"
puts  $f ""
close $f

# Generate gnuplot control file for comparison of series
set    f [open ${plot}-compare w]
puts  $f ""
puts  $f "plot \"$outbasicold\" using 1:2 title 'Memory Old' with steps, \\"
puts  $f "     \"$outbasic\"    using 1:2 title 'Memory New' with steps"
puts  $f "pause -1"
puts  $f ""
close $f
exit

# Comparison to baseline
plot "basic.dat"     using 1:2 title 'Memory Base'    with steps lt rgb "blue", \
     "newbasic.dat"  using 1:2 title 'Memory Current' with steps lt rgb "red", \

# Comparison to baseline via normalization - need math op (div)
plot "basic.dat"     using 1:2 title 'Memory Base'    with steps lt rgb "blue", \
     "newbasic.dat"  using 1:2 title 'Memory Current' with steps lt rgb "red", \


Changes to tools/cvs2fossil/lib/c2f_pinitcsets.tcl.

Old version:

    }

    typemethod run {} {
	# Pass manager interface. Executed to perform the
	# functionality of the pass.

	state transaction {
	    CreateRevisionChangesets  ; # Group file revisions into csets.
	    BreakInternalDependencies ; # Split the csets based on internal conflicts.
	    CreateSymbolChangesets    ; # Create csets for tags and branches.
	    PersistTheChangesets
	}

	repository printcsetstatistics
	integrity changesets
	return
    }

    typemethod discard {} {
	# Pass manager interface. Executed for all passes after the
	# run passes, to remove all data of this pass from the state,
	# as being out of date.
................................................................................
	# single commit contained revisions several hours apart,
	# likely due to trouble on the server hosting the repository.

	# We order the revisions here by time, this will help the
	# later passes (avoids joins later to get at the ordering
	# info).

	set n 0

	set lastmeta    {}
	set lastproject {}
	set revisions   {}

	# Note: We could have written this loop to create the csets
	#       early, extending them with all their revisions. This
................................................................................

	    if {$lastmeta != $mid} {
		if {[llength $revisions]} {
		    incr n
		    set  p [repository projectof $lastproject]
		    log write 14 initcsets meta_cset_begin
		    mem::mark
		    project::rev %AUTO% $p rev $lastmeta $revisions
		    log write 14 initcsets meta_cset_done
		    mem::mark
		    set revisions {}
		}
		set lastmeta    $mid
		set lastproject $pid
	    }
	    lappend revisions $rid
................................................................................
	}

	if {[llength $revisions]} {
	    incr n
	    set  p [repository projectof $lastproject]
	    log write 14 initcsets meta_cset_begin
	    mem::mark
	    project::rev %AUTO% $p rev $lastmeta $revisions
	    log write 14 initcsets meta_cset_done
	    mem::mark
	}

	log write 14 initcsets meta_done
	mem::mark

	log write 4 initcsets "Created [nsp $n {revision changeset}]"
	return
    }

    proc CreateSymbolChangesets {} {
	log write 3 initcsets {Create changesets based on symbols}
	mem::mark

................................................................................
	    WHERE T.sid = S.sid
	    ORDER BY S.sid, T.tid
	}] {
	    if {$lastsymbol != $sid} {
		if {[llength $tags]} {
		    incr n
		    set  p [repository projectof $lastproject]
		    project::rev %AUTO% $p sym::tag $lastsymbol $tags
		    set tags {}
		}
		set lastsymbol  $sid
		set lastproject $pid
	    }
	    lappend tags $tid
	}

	if {[llength $tags]} {
	    incr n
	    set  p [repository projectof $lastproject]
	    project::rev %AUTO% $p sym::tag $lastsymbol $tags
	}

	set lastsymbol {}
	set lasproject {}
	set branches   {}

	foreach {sid bid pid} [state run {
................................................................................
	    WHERE B.sid  = S.sid
	    ORDER BY S.sid, B.bid
	}] {
	    if {$lastsymbol != $sid} {
		if {[llength $branches]} {
		    incr n
		    set  p [repository projectof $lastproject]
		    project::rev %AUTO% $p sym::branch $lastsymbol $branches
		    set branches {}
		}
		set lastsymbol  $sid
		set lastproject $pid
	    }
	    lappend branches $bid
	}

	if {[llength $branches]} {
	    incr n
	    set  p [repository projectof $lastproject]
	    project::rev %AUTO% $p sym::branch $lastsymbol $branches
	}

	log write 4 initcsets "Created [nsp $n {symbol changeset}]"
	mem::mark
	return
    }

    proc BreakInternalDependencies {} {
	# This code operates on the revision changesets created by
	# 'CreateRevisionChangesets'. As such it has to follow after
	# it, before the symbol changesets are made. The changesets
	# are inspected for internal conflicts and any such are broken
	# by splitting the problematic changeset into multiple
	# fragments. The results are changesets which have no internal
	# dependencies, only external ones.

	log write 3 initcsets {Break internal dependencies}
	mem::mark
	set old [llength [project::rev all]]

	foreach cset [project::rev all] {
	    $cset breakinternaldependencies
	}

	set n [expr {[llength [project::rev all]] - $old}]
	log write 4 initcsets "Created [nsp $n {additional revision changeset}]"
	log write 4 initcsets Ok.
	mem::mark
	return
    }

    proc PersistTheChangesets {} {
	log write 3 initcsets "Saving [nsp [llength [project::rev all]] {initial changeset}] to the persistent state"

	foreach cset [project::rev all] {
	    $cset persist
	}

	log write 4 initcsets Ok.
	return
    }

    # # ## ### ##### ######## #############
    ## Configuration

    pragma -hasinstances   no ; # singleton
    pragma -hastypeinfo    no ; # no introspection
    pragma -hastypedestroy no ; # immortal

New version:

    }

    typemethod run {} {
	# Pass manager interface. Executed to perform the
	# functionality of the pass.

	state transaction {
	    CreateRevisionChangesets  ; # Group file revisions into
					# preliminary csets and split
					# them based on internal
					# conflicts.
	    CreateSymbolChangesets    ; # Create csets for tags and
					# branches.
	}

	repository printcsetstatistics
	integrity changesets

	# Load the changesets for use by the next passes.
	project::rev load ::vc::fossil::import::cvs::repository
	project::rev loadcounter
	return
    }

    typemethod discard {} {
	# Pass manager interface. Executed for all passes after the
	# run passes, to remove all data of this pass from the state,
	# as being out of date.
................................................................................
	# single commit contained revisions several hours apart,
	# likely due to trouble on the server hosting the repository.

	# We order the revisions here by time, this will help the
	# later passes (avoids joins later to get at the ordering
	# info).

	# The changesets made from these groups are immediately
	# inspected for internal conflicts and any such are broken by
	# splitting the problematic changeset into multiple
	# fragments. The results are changesets which have no internal
	# dependencies, only external ones.

	set n  0
	set nx 0

	set lastmeta    {}
	set lastproject {}
	set revisions   {}

	# Note: We could have written this loop to create the csets
	#       early, extending them with all their revisions. This
................................................................................

	    if {$lastmeta != $mid} {
		if {[llength $revisions]} {
		    incr n
		    set  p [repository projectof $lastproject]
		    log write 14 initcsets meta_cset_begin
		    mem::mark
		    set cset [project::rev %AUTO% $p rev $lastmeta $revisions]
		    log write 14 initcsets meta_cset_done
		    $cset breakinternaldependencies nx
		    $cset persist
		    $cset destroy
		    mem::mark
		    set revisions {}
		}
		set lastmeta    $mid
		set lastproject $pid
	    }
	    lappend revisions $rid
................................................................................
	}

	if {[llength $revisions]} {
	    incr n
	    set  p [repository projectof $lastproject]
	    log write 14 initcsets meta_cset_begin
	    mem::mark
	    set cset [project::rev %AUTO% $p rev $lastmeta $revisions]
	    log write 14 initcsets meta_cset_done
	    $cset breakinternaldependencies nx
	    $cset persist
	    $cset destroy
	    mem::mark
	}

	log write 14 initcsets meta_done
	mem::mark

	log write 4 initcsets "Created and saved [nsp $n {revision changeset}]"
	log write 4 initcsets "Created and saved [nsp $nx {additional revision changeset}]"

	mem::mark
	log write 4 initcsets Ok.
	return
    }

    proc CreateSymbolChangesets {} {
	log write 3 initcsets {Create changesets based on symbols}
	mem::mark

................................................................................
	    WHERE T.sid = S.sid
	    ORDER BY S.sid, T.tid
	}] {
	    if {$lastsymbol != $sid} {
		if {[llength $tags]} {
		    incr n
		    set  p [repository projectof $lastproject]
		    set cset [project::rev %AUTO% $p sym::tag $lastsymbol $tags]
		    set tags {}
		    $cset persist
		    $cset destroy
		}
		set lastsymbol  $sid
		set lastproject $pid
	    }
	    lappend tags $tid
	}

	if {[llength $tags]} {
	    incr n
	    set  p [repository projectof $lastproject]
	    set cset [project::rev %AUTO% $p sym::tag $lastsymbol $tags]
	    $cset persist
	    $cset destroy
	}

	set lastsymbol {}
	set lasproject {}
	set branches   {}

	foreach {sid bid pid} [state run {
................................................................................
	    WHERE B.sid  = S.sid
	    ORDER BY S.sid, B.bid
	}] {
	    if {$lastsymbol != $sid} {
		if {[llength $branches]} {
		    incr n
		    set  p [repository projectof $lastproject]
		    set cset [project::rev %AUTO% $p sym::branch $lastsymbol $branches]
		    set branches {}
		    $cset persist
		    $cset destroy
		}
		set lastsymbol  $sid
		set lastproject $pid
	    }
	    lappend branches $bid
	}

	if {[llength $branches]} {
	    incr n
	    set  p [repository projectof $lastproject]
	    set cset [project::rev %AUTO% $p sym::branch $lastsymbol $branches]
	    $cset persist
	    $cset destroy
	}

	log write 4 initcsets "Created and saved [nsp $n {symbol changeset}]"
	mem::mark
	return
    }

    # # ## ### ##### ######## #############
    ## Configuration

    pragma -hasinstances   no ; # singleton
    pragma -hastypeinfo    no ; # no introspection
    pragma -hastypedestroy no ; # immortal

Changes to tools/cvs2fossil/lib/c2f_prev.tcl.

Old version:

	    set key [list $cstype $iid]
	    set myitemmap($key) $self
	    lappend mytitems $key
	    log write 8 csets {MAP+ item <$key> $self = [$self str]}
	}
	return
    }

    method str {} {
	set str    "<"
	set detail ""
	if {[$mytypeobj bysymbol]} {
	    set detail " '[state one {
		SELECT S.name
................................................................................

    # item -> list (item)
    method nextmap {} {
	$mytypeobj successors tmp $myitems
	return [array get tmp]
    }

    method breakinternaldependencies {} {
	log write 14 csets {[$self str] BID}
	vc::tools::mem::mark
	##
	## NOTE: This method, maybe in conjunction with its caller
	##       seems to be a memory hog, especially for large
	##       changesets, with 'large' meaning to have a 'long list
	##       of items, several thousand'. Investigate where the
................................................................................

	set laste $firste
	foreach fragment [lrange $fragments 1 end] {
	    Border $fragment s e
	    integrity assert {$laste == ($s - 1)} {Bad fragment border <$laste | $s>, gap or overlap}

	    set new [$type %AUTO% $myproject $mytype $mysrcid [lrange $myitems $s $e]]

            log write 4 csets "Breaking [$self str ] @ $laste, new [$new str], cutting $breaks($laste)"

	    set laste $e
	}

	integrity assert {

New version:

	    set key [list $cstype $iid]
	    set myitemmap($key) $self
	    lappend mytitems $key
	    log write 8 csets {MAP+ item <$key> $self = [$self str]}
	}
	return
    }

    destructor {
	# The main thing is to keep track of the itemmap and remove
	# the object from it. The lists of changesets (mychangesets,
	# mytchangesets) are not maintained (= reduced), for the
	# moment. We may be able to get rid of this entirely, at least
	# for (de)construction and pass InitCSets.

	foreach iid $myitems {
	    set key [list $mytype $iid]
	    unset myitemmap($key)
	    log write 8 csets {MAP- item <$key> $self = [$self str]}
	}
	return
    }

    method str {} {
	set str    "<"
	set detail ""
	if {[$mytypeobj bysymbol]} {
	    set detail " '[state one {
		SELECT S.name
................................................................................

    # item -> list (item)
    method nextmap {} {
	$mytypeobj successors tmp $myitems
	return [array get tmp]
    }

    method breakinternaldependencies {cv} {
	upvar 1 $cv counter
	log write 14 csets {[$self str] BID}
	vc::tools::mem::mark
	##
	## NOTE: This method, maybe in conjunction with its caller
	##       seems to be a memory hog, especially for large
	##       changesets, with 'large' meaning to have a 'long list
	##       of items, several thousand'. Investigate where the
................................................................................

	set laste $firste
	foreach fragment [lrange $fragments 1 end] {
	    Border $fragment s e
	    integrity assert {$laste == ($s - 1)} {Bad fragment border <$laste | $s>, gap or overlap}

	    set new [$type %AUTO% $myproject $mytype $mysrcid [lrange $myitems $s $e]]
	    incr counter

            log write 4 csets "Breaking [$self str ] @ $laste, new [$new str], cutting $breaks($laste)"

	    set laste $e
	}

	integrity assert {