Fossil

Check-in [7dabede3]

Overview
Comment: Oops, make it work correct now.
SHA1: 7dabede3b38835601315188dd2cd2d530200f38f
User & Date: jan.nijtmans 2013-01-21 13:12:17
Context
2013-01-23 13:15  Further fine-tuning of the check for valid UTF8 characters in filenames. check-in: 4d456c9f user: drh tags: trunk
2013-01-21 13:12  Oops, make it work correct now. Closed-Leaf check-in: 7dabede3 user: jan.nijtmans tags: disallow-invalid-utf8-in-filenames
2013-01-21 09:39  From the changes.wiki for Fossil 1.25: "Disallow invalid UTF8 characters (such as characters in the surrogate pair range) in filenames." This completes the set of UTF8 characters which are generally considered invalid, so they should be disallowed in filenames: the "overlong form", invalid continuation bytes, and -finally- noncharacters. check-in: 011d5f69 user: jan.nijtmans tags: disallow-invalid-utf8-in-filenames
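
The categories named in the 09:39 comment (overlong form, invalid continuation bytes, surrogates, noncharacters) can be illustrated with a minimal sketch for 3-byte sequences. This is not Fossil's code: the enum, its values, and the function name `classify_3byte` are hypothetical, and the noncharacter test (U+FFFE, U+FFFF, U+FDD0..U+FDEF) follows the Unicode definition rather than anything shown in this check-in.

```c
/* Hypothetical result codes for illustration; not part of Fossil. */
enum utf8_verdict {
  UTF8_OK,
  UTF8_BAD_CONTINUATION,
  UTF8_OVERLONG,
  UTF8_SURROGATE,
  UTF8_NONCHARACTER
};

/* Classify the 3-byte UTF-8 sequence at p[0..2].
 * Assumes p[0] is a 3-byte lead byte (0xE0..0xEF) and that
 * three bytes are readable. */
static enum utf8_verdict classify_3byte(const unsigned char *p){
  unsigned int u;
  /* Both trailing bytes must look like 10xxxxxx. */
  if( (p[1]&0xc0)!=0x80 || (p[2]&0xc0)!=0x80 ){
    return UTF8_BAD_CONTINUATION;
  }
  /* 4 payload bits from the lead byte, 6 from each continuation byte. */
  u = ((p[0]&0x0fu)<<12) | ((p[1]&0x3fu)<<6) | (p[2]&0x3fu);
  if( u<=0x07ff ) return UTF8_OVERLONG;      /* fits in 1 or 2 bytes */
  if( u>=0xd800 && u<=0xdfff ) return UTF8_SURROGATE;
  if( u>=0xfffe || (u>=0xfdd0 && u<=0xfdef) ) return UTF8_NONCHARACTER;
  return UTF8_OK;
}
```

For example, the bytes E2 82 AC (U+20AC) classify as `UTF8_OK`, while E0 80 AF is an overlong encoding of U+002F and ED A0 80 decodes to the surrogate U+D800.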
Changes

Changes to src/file.c.

Before (lines 515-529 and 535-555):

        if( c&0x10 ){
          /* Unicode characters > U+FFFF are not supported.
           * Windows XP and earlier cannot handle them.
           */
          return 0;
        }
        /* This is a 3-byte UTF-8 character */
        unicode = ((c&0x0f)<<12) + ((c&0x3f)<<6) + (c&0x3f);
        if( unicode <= 0x07ff ){
          /* overlong form */
          return 0;
        }else if( unicode>=0xe000 ){
          /* U+E000..U+FFFF */
          if( (unicode<=0xf8ff) || (unicode>=0xfffe) ){
            /* U+E000..U+F8FF are for private use.
................................................................................
          }
        }else if( (unicode>=0xD800) && (unicode<=0xDFFF) ){
          /* U+D800..U+DFFF are for surrogate pairs. */
          return 0;
        }
      }
      do{
        if( (z[1]&0xc0)!=0x80 ){
          /* Invalid continuation byte (multi-byte UTF-8) */
          return 0;
        }
        /* The hi-bits of c are used to keep track of the number of expected
         * continuation-bytes, so we don't need a separate counter. */
        c<<=1; ++z;
      }while( c>=0xc0 );
    }else if( c=='\\' ){
      return 0;
    }
    if( c=='/' ){
      if( z[i+1]=='/' ) return 0;
      if( z[i+1]=='.' ){

After (lines 515-529 and 535-555):

        if( c&0x10 ){
          /* Unicode characters > U+FFFF are not supported.
           * Windows XP and earlier cannot handle them.
           */
          return 0;
        }
        /* This is a 3-byte UTF-8 character */
        unicode = ((c&0x0f)<<12) + ((z[i+1]&0x3f)<<6) + (z[i+2]&0x3f);
        if( unicode <= 0x07ff ){
          /* overlong form */
          return 0;
        }else if( unicode>=0xe000 ){
          /* U+E000..U+FFFF */
          if( (unicode<=0xf8ff) || (unicode>=0xfffe) ){
            /* U+E000..U+F8FF are for private use.
................................................................................
          }
        }else if( (unicode>=0xD800) && (unicode<=0xDFFF) ){
          /* U+D800..U+DFFF are for surrogate pairs. */
          return 0;
        }
      }
      do{
        if( (z[i+1]&0xc0)!=0x80 ){
          /* Invalid continuation byte (multi-byte UTF-8) */
          return 0;
        }
        /* The hi-bits of c are used to keep track of the number of expected
         * continuation-bytes, so we don't need a separate counter. */
        c<<=1; ++i;
      }while( c>=0xc0 );
    }else if( c=='\\' ){
      return 0;
    }
    if( c=='/' ){
      if( z[i+1]=='/' ) return 0;
      if( z[i+1]=='.' ){
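
The effect of the three changed lines can be reproduced in isolation. The sketch below is an illustration only: the helper names `decode3_before`, `decode3_after`, and `count_continuation` are hypothetical, and it assumes `c` behaves as an 8-bit unsigned value, which the excerpt above does not show.

```c
/* Expression as it stood before this check-in: all three terms
 * are taken from the lead byte c, so the continuation bytes are
 * never read. */
static unsigned int decode3_before(const unsigned char *z, int i){
  unsigned char c = z[i];
  return ((c&0x0f)<<12) + ((c&0x3f)<<6) + (c&0x3f);
}

/* Expression after this check-in: the low 12 bits come from the
 * two bytes that follow the lead byte. */
static unsigned int decode3_after(const unsigned char *z, int i){
  unsigned char c = z[i];
  return ((c&0x0f)<<12) + ((z[i+1]&0x3f)<<6) + (z[i+2]&0x3f);
}

/* The do/while loop uses the high bits of c as its counter:
 * each left shift (truncated to 8 bits) drops one leading 1 bit,
 * and the loop stops once fewer than two leading 1 bits remain.
 * Returns how many continuation bytes the loop would visit. */
static int count_continuation(unsigned char c){
  int n = 0;
  do{
    c <<= 1;
    n++;
  }while( c>=0xc0 );
  return n;
}
```

With the bytes E2 82 AC, `decode3_before` yields 0x28A2 while `decode3_after` yields the intended 0x20AC. `count_continuation` returns 1, 2, and 3 for the lead bytes 0xC3, 0xE2, and 0xF0. Advancing `i` instead of `z` in the loop keeps the `z[i+1]` lookups in the later `/` checks aligned with the byte that was just examined.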