Code 128 subsets, FNC1-FNC4, and the mod-103 check digit

Full reference for Code 128: subsets A, B, C; FNC1 through FNC4; the mod-103 check digit with a worked example; the FNC4 + ISO/IEC 8859-1 Latin-1 extension. Sourced against ISO/IEC 15417:2007.

Code 128 is a high-density linear barcode that encodes the full 128-character ASCII set, plus the entire ISO/IEC 8859-1 (Latin-1) repertoire by way of an extension symbol. It does this with three character subsets (A, B, and C), four function characters (FNC1 through FNC4), three latch characters, one shift character, and a mandatory mod-103 check digit. The symbology is formally specified in ISO/IEC 15417:2007. Try it live in the Code 128 generator; everything below is what the generator is doing under the hood.

This page is a reference, not a tutorial. Each section is anchored so external citations can deep-link to a single fact. If you only need one number, jump to control symbol values, FNC values per subset, or mod-103 calculation.

The three subsets

Code 128 has exactly three character subsets. A symbol stays in one subset until a latch character switches it to another, or a shift character peeks into another for a single character and reverts. There is no fourth subset and no extended mode beyond what FNC4 provides.

| Subset | What it encodes | Typical use |
| --- | --- | --- |
| A | ASCII 00-95: control characters (NUL through US), digits, uppercase letters, and standard punctuation. No lowercase. | Anything that includes ASCII control characters (tab, line feed, group separator) or is uppercase-only. |
| B | ASCII 32-127: space, digits, uppercase, lowercase, punctuation, and DEL. No control characters below space. | Default subset for general printable text. The most common subset in everyday use. |
| C | Pairs of digits 00-99. Each symbol encodes two digits in one position, doubling density. No letters, no punctuation, no control characters. | Long numeric runs. GTIN, SSCC, serial numbers, and most GS1 Application Identifier values. |

Subset C is what makes Code 128 dense for numeric data. A 20-digit serial in Set B costs 20 data symbols; the same serial in Set C costs 10. That density is also what drives the encoder's subset-selection logic, covered under Set C length optimization. For a side-by-side density comparison against other linear codes, see Code 128 vs. Code 39.

The encoder picks subsets for you. You do not insert latch or shift characters in your input string; you give the encoder bytes, and it chooses the subset sequence that minimises symbol count. That choice is deterministic and follows the rules under Set C length optimization.

Start, stop, and the special symbol values

Every symbol value in Code 128 is an integer from 0 to 106. Most values map directly to a character (different character per subset). In subsets A and B, eleven values (96-106) are reserved for control and carry twelve functions between them: three start codes, one stop code, three latches, one shift, and the four FNC functions, with FNC4 sharing values 100 and 101 with the CODE B and CODE A latches. The check digit, covered under The mod-103 check digit, also draws from values 0-102.

| Symbol value | Name | Function |
| --- | --- | --- |
| 103 | START A | Begins a symbol in subset A. |
| 104 | START B | Begins a symbol in subset B. |
| 105 | START C | Begins a symbol in subset C. |
| 106 | STOP | Terminates the symbol. Unique 13-module pattern (every other symbol is 11 modules). |
| 101 | CODE A | Latch into subset A. In Set A this same value is FNC4. |
| 100 | CODE B | Latch into subset B. In Set B this same value is FNC4. |
| 99 | CODE C | Latch into subset C. |
| 98 | SHIFT | One-character shift between A and B. Not available in or to subset C. |
| 102 | FNC1 | Application identifier flag (GS1) when first; field separator elsewhere. |
| 97 | FNC2 | Message append. A and B only. |
| 96 | FNC3 | Reader programming. A and B only. |
| 101 in Set A, 100 in Set B | FNC4 | ISO/IEC 8859-1 (Latin-1) extension. Subset A and B only. Not available in C. |

The contextual reuse of values 100 and 101 is the single biggest source of confusion when reading Code 128 by hand. The same physical bar pattern means two different things depending on the current subset:

  • Value 100: FNC4 in Set B; latch to Set B (CODE B) when read from Set A or Set C. Set C has no literal pair at this value (Set C pairs run 00-99, values 0-99).
  • Value 101: FNC4 in Set A; latch to Set A (CODE A) when read from Set B or Set C. Same as above: not a literal pair in Set C.

The decoder tracks the current subset state across the whole symbol; it is not a property of any one bar pattern. This is why a Code 128 decoder cannot operate symbol-by-symbol in isolation: every value is interpreted in the context of the previous subset state.
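A minimal Python sketch of that state tracking follows. The function name interpret and the token spellings such as <CODE B> are illustrative choices, not taken from the standard. The input is assumed to be a list of symbol values that begins with a start value and has the check symbol and STOP already removed. Function characters are only labelled here; the +128 effect of FNC4 is not applied.

```python
START = {103: "A", 104: "B", 105: "C"}
# Latch values as seen from each active subset.
LATCH = {
    "A": {99: "C", 100: "B"},
    "B": {99: "C", 101: "A"},
    "C": {100: "B", 101: "A"},
}
# Function characters as seen from each active subset.
FUNCTION = {
    "A": {96: "FNC3", 97: "FNC2", 101: "FNC4", 102: "FNC1"},
    "B": {96: "FNC3", 97: "FNC2", 100: "FNC4", 102: "FNC1"},
    "C": {102: "FNC1"},
}
SHIFT = 98


def data_char(subset, value):
    """Map a data value (0-95 in A/B, 0-99 in C) to its character(s)."""
    if subset == "C":
        return f"{value:02d}"
    if subset == "A" and value >= 64:
        return chr(value - 64)      # ASCII control characters 0-31
    return chr(value + 32)          # ASCII 32-95 in A, 32-127 in B


def interpret(values):
    """Return readable tokens for start + data values (no check, no STOP)."""
    if not values or values[0] not in START:
        raise ValueError("first value must be a start character (103-105)")
    subset = START[values[0]]
    tokens = []
    shifted = False
    for value in values[1:]:
        if not 0 <= value <= 102:
            raise ValueError(f"unexpected value {value} in data")
        if shifted:
            # Assumption: the symbol after SHIFT is a plain data character.
            if value > 95:
                raise ValueError("SHIFT must be followed by a data character")
            tokens.append(data_char("B" if subset == "A" else "A", value))
            shifted = False
        elif value in LATCH[subset]:
            subset = LATCH[subset][value]
            tokens.append(f"<CODE {subset}>")
        elif value in FUNCTION[subset]:
            tokens.append(f"<{FUNCTION[subset][value]}>")
        elif value == SHIFT and subset != "C":
            shifted = True
        else:
            tokens.append(data_char(subset, value))
    return tokens


print(interpret([103, 33, 100, 65, 100]))
# ['A', '<CODE B>', 'a', '<FNC4>']
```

The same value 100 appears twice in the input: read from Set A it latches to Set B, and read from Set B it is FNC4.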

FNC1 through FNC4: values and signaling

Code 128 has exactly four function characters. Each one occupies one symbol position and is encoded as a numeric value, just like a data character. The values are:

| Function | Set A | Set B | Set C | Decoded as |
| --- | --- | --- | --- | --- |
| FNC1 | 102 | 102 | 102 | Application identifier flag, or ASCII GS (0x1D) field separator. |
| FNC2 | 97 | 97 | not available | Message append marker (rarely used in modern systems). |
| FNC3 | 96 | 96 | not available | Reader-programming directive (consumed by reader, not transmitted). |
| FNC4 | 101 | 100 | not available | Latin-1 extension shift or latch. |

FNC1 is the only function character that is valid in all three subsets. FNC2, FNC3, and FNC4 are valid only in subsets A and B. This is not an arbitrary restriction: subset C is purely a numeric pair encoder, so the values 96 and 97 already mean the literal pairs 96 and 97 in C, and the values 100 and 101 are the CODE B and CODE A latches there. The standard reserves no slot for FNC2/3/4 inside subset C, so the encoder must latch to A or B before emitting them.

FNC1: GS1 sentinel and field separator

FNC1 has two distinct meanings depending on its position in the symbol:

  1. Immediately after the start character: this declares the symbol to be a GS1-128 (formerly UCC/EAN-128). The data that follows is structured according to the GS1 Application Identifier system. Decoders that recognise this prefix flag the read as a GS1-formatted symbol; some transmit a leading ]C1 AIM identifier prefix to indicate the GS1 mode.
  2. Anywhere else in the symbol: FNC1 acts as a field separator between two adjacent variable-length AI fields. Decoders typically transmit it as ASCII GS (Group Separator, 0x1D, decimal 29) so application software can split the data stream cleanly.

The GS1 General Specifications §7 enumerates which Application Identifiers are fixed-length (no separator needed) and which are variable-length (FNC1 separator required when followed by another AI). For the practical difference between plain Code 128 and a GS1-128 carrying AIs, see GS1-128 vs. Code 128.
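The two roles of FNC1 can be sketched from the decoder's side. This is a minimal Python illustration under stated assumptions: the sentinel FNC1 and the function transmit are hypothetical names, the reader is assumed to be configured to transmit AIM identifiers, and the FNC1-in-second-position case is ignored.

```python
GS = "\x1d"          # ASCII Group Separator, decimal 29
FNC1 = object()      # hypothetical sentinel standing in for a decoded FNC1


def transmit(tokens):
    """Build the host string for decoded tokens (data after the start character)."""
    if tokens and tokens[0] is FNC1:
        prefix, body = "]C1", tokens[1:]    # FNC1 first: GS1-128
    else:
        prefix, body = "]C0", tokens        # plain Code 128
    # Any later FNC1 is sent to the host as a GS field separator.
    return prefix + "".join(GS if t is FNC1 else t for t in body)


# AI 10 (batch/lot, variable length) followed by AI 21 (serial number).
tokens = [FNC1, *"10ABC", FNC1, *"2112345"]
message = transmit(tokens)
fields = message[3:].split(GS)   # strip the 3-character prefix, then split
print(fields)                    # ['10ABC', '2112345']
```

Without the separator, the host could not tell where the variable-length batch value ends and the next AI begins.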

FNC2: message append

FNC2 instructs a reader to buffer the current symbol's data and concatenate it with the data of the next symbol read. Two physically separate Code 128 symbols can be presented as a single logical message to the host. FNC2 is only meaningful in symbol position one (immediately after the start character). It is rare in modern deployments; most concatenation is now handled at the application layer.

FNC3: reader programming

FNC3 marks the symbol as a reader-configuration command rather than data. The decoder consumes the symbol locally (changes a setting, switches a mode, sets a prefix or suffix) and does not transmit any payload to the host. Vendor-specific configuration sheets distributed by scanner manufacturers use FNC3 extensively. Like FNC2, FNC3 must appear in the first data position to take effect.

FNC4: Latin-1 extension

FNC4 is the mechanism by which Code 128 carries characters above ASCII 127, specifically the ISO/IEC 8859-1 (Latin-1) repertoire (values 128-255). Every claim in this section traces back to ISO/IEC 15417:2007 §4.4.4 and the relevant clauses of the GS1 General Specifications.

FNC4 has two operating modes: a single-character shift and a two-symbol latch. Both modes apply only inside subset A or subset B; FNC4 is not legal inside subset C and must not appear in a Set C run.

Single-shift mode (one FNC4)

A single FNC4, followed by exactly one data symbol, decodes the next symbol's value as a Latin-1 character at value + 128. After that one character, the decoder reverts to plain ASCII for the next symbol. The shift consumes one symbol position (the FNC4 itself) and applies to exactly one following data symbol.

Worked example. To encode the Latin-1 character é (e-acute, Latin-1 value 233 = 0xE9) inside a Set B run:

Stream:      [data...]  FNC4(100)  i(value 73)  [data...]
Decoded:     [data...]  é (decoded char 105 + 128 = 233)  [data...]

Set B symbol value 73 is the lowercase letter i (ASCII 105). With FNC4 in effect, the decoder adds 128 to the decoded character value, yielding 233, which is é in Latin-1. The FNC4 only modifies the single immediately-following symbol; the next symbol after that decodes normally.

Latch mode (two consecutive FNC4)

Two consecutive FNC4 symbols latch the decoder into Latin-1 mode. Every following data symbol in the current subset has +128 applied to its decoded value, until either:

  • a single FNC4 appears, which acts as a one-character revert to plain ASCII (the next symbol decodes as plain Set A or B; the symbol after that resumes Latin-1); or
  • two consecutive FNC4 symbols appear, which un-latch back to plain ASCII for all following symbols; or
  • a CODE A, CODE B, or CODE C latch occurs, which switches subset and clears the Latin-1 latch.

The latch is the only practical way to encode a long run of Latin-1 text. Without it, every Latin-1 character would cost two symbols (one FNC4 plus the data symbol), doubling the symbol count for non-ASCII text.
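The shift and latch rules above can be sketched as a small state machine. This Python example is an illustration under simplifying assumptions: it handles a run of Set B values only (no subset latches, no other function characters), the name decode_set_b_latin1 is made up for this page, and three or more consecutive FNC4 symbols are not given any special treatment.

```python
FNC4_B = 100  # in Set B, value 100 means FNC4


def decode_set_b_latin1(values):
    """Decode Set B data values, applying the FNC4 shift and latch rules."""
    out = []
    latched = False
    i = 0
    while i < len(values):
        value = values[i]
        if value == FNC4_B:
            nxt = values[i + 1] if i + 1 < len(values) else None
            if nxt == FNC4_B:
                latched = not latched        # double FNC4 toggles the latch
                i += 2
                continue
            if nxt is None or not 0 <= nxt <= 95:
                raise ValueError("FNC4 must be followed by a data symbol")
            # A single FNC4 inverts the current mode for exactly one symbol.
            code = nxt + 32
            out.append(chr(code if latched else code + 128))
            i += 2
            continue
        if not 0 <= value <= 95:
            raise ValueError("only Set B data values and FNC4 are handled")
        code = value + 32
        out.append(chr(code + 128 if latched else code))
        i += 1
    return "".join(out)


print(decode_set_b_latin1([67, 65, 70, 100, 73]))            # café
print(decode_set_b_latin1([100, 100, 73, 100, 65, 73]))      # éaé
```

The second call latches into Latin-1, uses a single FNC4 to drop back to plain ASCII for one character, and then resumes Latin-1 without a second latch.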

FNC4 is precisely the mechanism that lets Code 128 carry the full ISO/IEC 8859-1 repertoire. The GS1 General Specifications §5.3.1.1 forbids FNC4 inside GS1-128 symbols, on the grounds that GS1 Application Identifier data is restricted to invariant printable ASCII; FNC4 is therefore a feature of plain Code 128, not GS1-128.

Complete value-to-character mapping

This is the canonical Code 128 character set: 107 symbol values (0-106), each interpreted differently depending on the active subset. Set A and Set B agree on values 0-63 (the printable ASCII range from space through underscore). They diverge from value 64 onward, where Set A continues into ASCII control characters and Set B continues into lowercase plus DEL. Set C is purely numeric pairs.

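| Value | Set A | Set B | Set C |
| --- | --- | --- | --- |
| 0 | SP | SP | 00 |
| 1-15 | ! " # $ % & ' ( ) * + , - . / | same as A | 01-15 |
| 16-25 | 0-9 | 0-9 | 16-25 |
| 26-31 | : ; < = > ? | same as A | 26-31 |
| 32 | @ | @ | 32 |
| 33-58 | A-Z | A-Z | 33-58 |
| 59-63 | [ \ ] ^ _ | same as A | 59-63 |
| 64 | NUL (ASCII 0) | ` (backtick) | 64 |
| 65-90 | SOH-SUB (ASCII 1-26) | a-z | 65-90 |
| 91-95 | ESC FS GS RS US (ASCII 27-31) | {, vertical bar, }, ~, DEL | 91-95 |
| 96 | FNC3 | FNC3 | 96 |
| 97 | FNC2 | FNC2 | 97 |
| 98 | Shift B | Shift A | 98 |
| 99 | CODE C | CODE C | 99 |
| 100 | CODE B | FNC4 | CODE B |
| 101 | FNC4 | CODE A | CODE A |
| 102 | FNC1 | FNC1 | FNC1 |
| 103 | START A (only valid as first symbol) | same as A | same as A |
| 104 | START B (only valid as first symbol) | same as A | same as A |
| 105 | START C (only valid as first symbol) | same as A | same as A |
| 106 | STOP (13-module pattern; only valid as last symbol) | same as A | same as A |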

The bar/space pattern for each value (three bars and three spaces, summing to 11 modules) is identical across subsets. The subset only changes the character interpretation, never the printed pattern. The pattern table itself is reproduced in full at Wikipedia: Code 128; this article does not duplicate it.
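The data rows of the table reduce to simple arithmetic on the ASCII code. The Python helpers below are a sketch of that arithmetic; the function names are illustrative, and only data characters are covered (no function, latch, start, or stop values).

```python
def value_in_set_a(ch):
    """Symbol value of a single character in Set A (ASCII 0-95)."""
    code = ord(ch)
    if 32 <= code <= 95:
        return code - 32        # space through underscore: values 0-63
    if 0 <= code <= 31:
        return code + 64        # control characters: values 64-95
    raise ValueError(f"{ch!r} is not encodable in Set A")


def value_in_set_b(ch):
    """Symbol value of a single character in Set B (ASCII 32-127)."""
    code = ord(ch)
    if 32 <= code <= 127:
        return code - 32        # space through DEL: values 0-95
    raise ValueError(f"{ch!r} is not encodable in Set B")


def value_in_set_c(pair):
    """Symbol value of a two-digit pair in Set C."""
    if len(pair) != 2 or not (pair.isascii() and pair.isdigit()):
        raise ValueError(f"{pair!r} is not a two-digit pair")
    return int(pair)


print(value_in_set_a("A"), value_in_set_b("a"), value_in_set_c("42"))
# 33 65 42
```

For every character from space through underscore the two functions return the same value, which is the agreement on values 0-63 described above.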

Subset switching: shift vs. latch

There are two ways to step out of the current subset. They are not interchangeable.

SHIFT (value 98): a one-character peek. SHIFT exists only between subsets A and B. In Set A, value 98 means "interpret the next symbol from Set B, then return to Set A". In Set B, value 98 means "interpret the next symbol from Set A, then return to Set B". SHIFT is not legal in Set C and cannot reach Set C.

CODE A / CODE B / CODE C (values 101, 100, 99): a permanent latch. The decoder switches subset and stays there until another latch occurs. Latches between any two subsets are legal in either direction. The latch consumes one symbol position.

Use SHIFT when exactly one character requires the other subset and the surrounding text stays in the current subset: one SHIFT costs one symbol, while latching out and back costs two. For two consecutive foreign characters followed by a return to the original subset, two SHIFTs and a latch pair tie at two overhead symbols. From three characters onward, or whenever the data does not return to the original subset (a single latch, one symbol), the latch is cheaper. The encoder calculates the breakeven automatically.

Worked example. Encoding the literal payload ABC123abc. A naive encoder that latches at every change of character class would produce:

Tokens:   START A  A   B   C   CODE C  12  CODE B  3   a   b   c   [check]  STOP
Values:   103      33  34  35  99      12  100     19  65  66  67  [check]  106

That is thirteen symbols: the two latches cost two symbols, and the single Set C pair saves only one. The three-digit run 123 is too short for Set C to pay off, and its odd length leaves the 3 to be encoded in Set B anyway. Set B handles ABC (uppercase), 123 (digits), and abc (lowercase) without any latch, so the encoder selects START B instead. The optimal sequence is therefore:

Tokens:   START B  A   B   C   1   2   3   a   b   c   [check]  STOP
Values:   104      33  34  35  17  18  19  65  66  67  [check]  106

One start, nine data symbols, one check, one stop = twelve symbols, one fewer than the naive Set-A/Set-C/Set-B sequence above.

The mod-103 check digit

Every Code 128 symbol carries a mandatory check digit. It is calculated over the start character and all data symbols (including any FNC, SHIFT, and CODE characters), and it sits between the last data symbol and the STOP. The check digit is one symbol value (0-102), encoded with the same bar pattern that the corresponding character would use in the current subset at that position.

Formula

check = (StartValue + sum_over_i(SymbolValue[i] * i)) mod 103
        where i = 1, 2, 3, ... over each data symbol in order

The start character's value contributes with weight 1 (no multiplier). The first data symbol contributes with weight 1, the second with weight 2, and so on. FNC, SHIFT, and CODE characters contribute their symbol values with the same positional weight as any data character. The result, taken modulo 103, is the check symbol value.

Worked example: payload PJJ123C

The encoder selects START B (value 104) because the payload is mixed uppercase and digits, and the digit run 123 is too short to amortise a CODE C latch. The data symbols and their Set B values:

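| Position i | Symbol | Set B value | i × value |
| --- | --- | --- | --- |
| (start) | START B | 104 | 104 |
| 1 | P | 48 | 48 |
| 2 | J | 42 | 84 |
| 3 | J | 42 | 126 |
| 4 | 1 | 17 | 68 |
| 5 | 2 | 18 | 90 |
| 6 | 3 | 19 | 114 |
| 7 | C | 35 | 245 |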

Sum: 104 + 48 + 84 + 126 + 68 + 90 + 114 + 245 = 879.

Check digit: 879 mod 103. Since 103 × 8 = 824 and 879 - 824 = 55, the check digit is 55.

Final symbol stream:

START B(104)  P(48)  J(42)  J(42)  1(17)  2(18)  3(19)  C(35)  [check=55]  STOP(106)
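The formula and the worked example above can be checked with a few lines of Python. The function name code128_check is illustrative; the input is assumed to be the start value followed by every data symbol value in order, including any FNC, SHIFT, or CODE values.

```python
def code128_check(values):
    """Return the mod-103 check value for [start, data1, data2, ...]."""
    if not values:
        raise ValueError("at least the start value is required")
    start, data = values[0], values[1:]
    # The start value is added once; data symbols are weighted 1, 2, 3, ...
    weighted = sum(position * value
                   for position, value in enumerate(data, start=1))
    return (start + weighted) % 103


# START B, then P J J 1 2 3 C as Set B values.
print(code128_check([104, 48, 42, 42, 17, 18, 19, 35]))   # 55
```

A verifier can use the same function in reverse: recompute the value over the start and data symbols and compare it with the symbol read just before STOP.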

Converting the remainder to a symbol pattern

The check digit is a number from 0 to 102. To convert it back to the bar/space pattern that gets printed, look it up in the Code 128 character set table by symbol value. There is no separate "check character set"; the check digit uses the same value-to-pattern mapping as any other symbol. In the example above, value 55 in Set B is the character W; the printed bar pattern for the check position is exactly the bar pattern of W.

Because the bar pattern for a given value is the same in every subset, the check symbol prints identically regardless of which subset is active at its position. The character that the value would represent there (a digit pair in Set C; a letter, a control character, or a function character in Set A or B) is irrelevant: the decoder compares the check symbol's value against its own calculation and never interprets it as data.

Worked example with Set C: 00102030405060

A 14-digit numeric payload with even length, all digits, fits cleanly in Set C from start to finish. The encoder selects START C (value 105). Each pair of digits is one symbol value: 00=0, 10=10, 20=20, 30=30, 40=40, 50=50, 60=60.

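| Position i | Pair | Set C value | i × value |
| --- | --- | --- | --- |
| (start) | START C | 105 | 105 |
| 1 | 00 | 0 | 0 |
| 2 | 10 | 10 | 20 |
| 3 | 20 | 20 | 60 |
| 4 | 30 | 30 | 120 |
| 5 | 40 | 40 | 200 |
| 6 | 50 | 50 | 300 |
| 7 | 60 | 60 | 420 |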

Sum: 105 + 0 + 20 + 60 + 120 + 200 + 300 + 420 = 1225.

Check: 1225 mod 103. 103 × 11 = 1133; 1225 - 1133 = 92. Check digit value 92.

Set C value 92 is the literal pair 92. The check position prints with the bar pattern for value 92, and a decoder transmits no character for it (the check digit is consumed by the decoder, not transmitted to the host). Final symbol stream: ten symbols total (1 start + 7 data + 1 check + 1 stop), encoding 14 digits. The same payload in Set B would need 17 symbols (1 start + 14 data + 1 check + 1 stop), so Set C saves 7 symbols.
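The pair splitting and the check calculation for this payload can be reproduced in Python. This is a sketch: set_c_values is an illustrative name, and the check arithmetic is written inline with START C hard-coded as 105.

```python
def set_c_values(digits):
    """Split an even-length digit string into Set C symbol values."""
    if len(digits) % 2 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("Set C needs an even-length string of ASCII digits")
    return [int(digits[i:i + 2]) for i in range(0, len(digits), 2)]


payload = "00102030405060"
data = set_c_values(payload)
# START C (105) is added once; each pair is weighted by its position.
check = (105 + sum(i * v for i, v in enumerate(data, start=1))) % 103

print(data)                       # [0, 10, 20, 30, 40, 50, 60]
print(check)                      # 92
print(len(payload) - len(data))   # 7 data symbols saved versus Set B
```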

Worked example with FNC4 and Latin-1: café au lait

The string café au lait contains one Latin-1 character (é, value 233) embedded in lowercase ASCII. There are two ways to encode it; the encoder picks the cheaper one.

Option 1: single FNC4 shift. Stay in Set B, emit FNC4 immediately before é, then continue. Cost: one extra symbol (the FNC4) for one Latin-1 character. Total: 12 character symbols (the 12 characters of café au lait, with é carried as Set B value 73) + 1 FNC4 = 13 data symbols.

Option 2: FNC4 latch, then unlatch. Two FNC4s to enter Latin-1 mode, then two FNC4s to leave it. Cost: four extra symbols. Worse than option 1 for a single Latin-1 character. The encoder uses option 1.

A latch that must be exited costs four overhead symbols (two FNC4 to enter, two FNC4 to leave), so it ties with single shifts at four consecutive Latin-1 characters and wins from five. When the Latin-1 run extends to the end of the data, no exit is needed: the latch costs two symbols and wins from three characters. For a long Latin-1 string, the latch is essential; for one stray accent, single shift wins.
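The overhead counts depend on whether the latch has to be exited before the end of the data. A short Python sketch of the comparison follows; fnc4_overhead is an illustrative name, and the model ignores mixed runs in which a single FNC4 reverts one character inside a latch.

```python
def fnc4_overhead(run_length, must_exit=True):
    """Return (shift_cost, latch_cost) in overhead symbols for a Latin-1 run."""
    shift = run_length                # one FNC4 per Latin-1 character
    latch = 4 if must_exit else 2     # FNC4 FNC4 to enter, FNC4 FNC4 to leave
    return shift, latch


print(fnc4_overhead(1))                     # (1, 4): shift wins for café au lait
print(fnc4_overhead(3, must_exit=False))    # (3, 2): latch wins at end of data
```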

AIM identifier prefixes

Code 128 decoders that support the AIM Symbology Identifier transmission format prepend a three-character prefix to the data stream that identifies the symbology and any in-band variants. For Code 128, the prefix is ]C followed by a single-digit modifier:

| Prefix | Meaning |
| --- | --- |
| ]C0 | Standard Code 128 (no FNC1 in first or second position). |
| ]C1 | GS1-128 (FNC1 in first position). |
| ]C2 | FNC1 in second position after the start character. |
| ]C4 | Concatenation according to the ISBT 128 specification has been performed; concatenated data follows. |

The prefix is not part of the symbol's data; it is generated by the decoder from the symbol's structure. Hosts that need to know whether a Code 128 symbol carries GS1-formatted data should rely on this prefix rather than scanning the data stream for AIs.
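On the host side, relying on the prefix amounts to a three-character check. The Python sketch below assumes the reader is configured to transmit AIM identifiers; the function names are illustrative. If the reader does not send prefixes, data that happens to begin with ]C and a digit would be misread, so this check is only safe when the configuration is known.

```python
def split_aim_prefix(message):
    """Return (modifier, data); modifier is None when no ]C prefix is present."""
    if (len(message) >= 3 and message.startswith("]C")
            and message[2] in "0123456789"):
        return message[2], message[3:]
    return None, message


def is_gs1_128(message):
    """True when the prefix is ]C1, meaning FNC1 in first position."""
    modifier, _ = split_aim_prefix(message)
    return modifier == "1"


print(split_aim_prefix("]C10112345"))   # ('1', '0112345')
print(is_gs1_128("]C0ABC"))             # False
```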

Total symbol length

Every Code 128 data character (and start, latch, shift, FNC, and check) is exactly 11 modules wide, encoded as three bars and three spaces in a fixed widths-pattern. The STOP is 13 modules: three bars and three spaces plus a final two-module bar that closes the symbol.

Total modules = 11 * (data_symbols + start + check) + 13_for_stop
              = 11 * (n + 2) + 13
              = 11n + 35
  where n = number of data symbols (excluding start, check, and stop)

Add quiet zones of at least ten modules on each side. The minimum X-dimension is application-specific; for retail and shipping labels, see how to print barcode labels that scan for X-dimension and quiet-zone targets.

X-dimension and printed width

The X-dimension is the width of the narrowest module, measured in the printed symbol. Code 128 has no fixed X-dimension; the standard recommends a range of 0.250 mm to 1.016 mm for general use. GS1-128 retail and logistics labels typically use 0.495 mm (about 19.5 mil) to 0.940 mm (37 mil), with 0.495 mm the floor for SSCC-18 shipping labels per GS1 General Specifications §5.4.4. At smaller X-dimensions the symbol is denser but more sensitive to print quality and scanner resolution; at larger X-dimensions the symbol takes more horizontal space but is more forgiving.

The printed width in millimetres is therefore X × (11n + 35) plus quiet zones. A 12-character Set B payload (n=12) at X = 0.5 mm prints as roughly 12 × 11 + 35 = 167 modules × 0.5 mm = 83.5 mm, plus 5 mm of quiet zone on each side = 93.5 mm total. This is the budget you have when laying out a label; if it does not fit, either drop the X-dimension or shorten the payload.
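The same budget can be computed for any payload. This Python sketch uses illustrative function names; n_data counts every symbol between the start and the check (data, latch, shift, and FNC symbols), and the quiet zone defaults to the ten-module minimum per side.

```python
def total_modules(n_data):
    """Modules in the symbol itself: 11 per symbol plus the 13-module STOP."""
    return 11 * n_data + 35


def printed_width_mm(n_data, x_mm, quiet_modules=10):
    """Overall width in millimetres, including both quiet zones."""
    return x_mm * (total_modules(n_data) + 2 * quiet_modules)


print(total_modules(12))            # 167
print(printed_width_mm(12, 0.5))    # 93.5
```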

Print verification

Code 128 has no error correction. The mod-103 check digit catches single-symbol errors but cannot recover from a misread. Print quality is therefore the dominant factor in scan reliability. The applicable verification standard is ISO/IEC 15416 for linear barcode print quality, which assigns a grade from A to F (4.0 to 0.0) based on decode, symbol contrast, minimum reflectance, minimum edge contrast, modulation, defects, and decodability. Most retail and logistics applications require grade C (1.5) or better; pharmaceutical and medical-device labelling under GS1 Healthcare typically requires grade C (1.5) measured with 660 nm illumination.

Set C length optimization

The encoder switches into Set C when a long enough run of digits makes the density savings outweigh the cost of the CODE C latch. The exact breakeven depends on whether Set C is at the start of the symbol (where it costs only the START C character, no latch) and whether the digit run has even length (Set C requires digit pairs).

The GS1 General Specifications §5.4.7.7 give the canonical optimisation rules. In summary:

  • If the entire symbol is digits and the count is even, use START C and stay in C for the whole symbol.
  • If the symbol begins with four or more digits, start in C; otherwise start in A or B.
  • Inside an A or B run, switch to C when six or more consecutive digits remain (four if at end-of-data, because there is no latch back).
  • If the digit run has odd length, the last digit is encoded in A or B with a CODE A or CODE B latch.

These rules are what "the encoder picks subsets for you" actually means. They are deterministic, not heuristic; two ISO-conformant encoders given the same input must produce the same symbol-level output (modulo trailing latch position when ties exist).
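A simplified Python sketch of the rules as summarised above follows. It is not a conformant encoder: it covers Sets B and C only (no Set A, SHIFT, or FNC characters), the name encode_b_c is illustrative, and an odd-length digit run leaves its last digit to Set B as the summary states. A production encoder may place the odd digit differently.

```python
START_B, START_C, CODE_C, CODE_B = 104, 105, 99, 100


def digit_run(text, i):
    """Length of the run of ASCII digits starting at index i."""
    j = i
    while j < len(text) and "0" <= text[j] <= "9":
        j += 1
    return j - i


def encode_b_c(text):
    """Return start + data symbol values (no check, no STOP)."""
    if any(not 32 <= ord(ch) <= 127 for ch in text):
        raise ValueError("this sketch handles Set B characters only")
    n = len(text)
    lead = digit_run(text, 0)
    # Start in C for four or more leading digits, or an all-digit even payload.
    if lead >= 4 or (lead == n and n >= 2 and n % 2 == 0):
        subset, values = "C", [START_C]
    else:
        subset, values = "B", [START_B]
    i = 0
    while i < n:
        run = digit_run(text, i)
        if subset == "C":
            if run >= 2:
                values.append(int(text[i:i + 2]))   # one symbol per digit pair
                i += 2
            else:
                values.append(CODE_B)               # odd digit or non-digit next
                subset = "B"
        else:
            threshold = 4 if i + run == n else 6    # no latch back at end of data
            if run >= threshold:
                values.append(CODE_C)
                subset = "C"
            else:
                values.append(ord(text[i]) - 32)
                i += 1
    return values


print(encode_b_c("ABC123abc"))   # [104, 33, 34, 35, 17, 18, 19, 65, 66, 67]
print(encode_b_c("AB123456"))    # [104, 33, 34, 99, 12, 34, 56]
```

The first call stays in Set B because the three-digit run never reaches a threshold; the second latches to Set C because six digits run to the end of the data.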

GS1-128 = Code 128 + FNC1 first

GS1-128 is not a separate symbology. It is plain Code 128 with one extra rule: FNC1 must appear in the first data position. That single FNC1 declares the symbol to be a GS1-128 carrying Application Identifier data. Everything in this article applies to GS1-128 symbols equally, with two adjustments:

  • FNC4 (and therefore the Latin-1 extension) is forbidden in GS1-128. AI data is restricted to invariant printable ASCII.
  • FNC1 in non-first positions inside a GS1-128 acts as the field separator between adjacent variable-length AI fields. Decoders transmit it as ASCII GS (0x1D).

For the practical encoding differences and a payload-by-payload comparison, see the GS1-128 generator and GS1-128 vs. Code 128.

Common mistakes

  1. Manually inserting FNC, SHIFT, or CODE characters into your input. The encoder does this for you. Inserting them yourself produces a symbol whose data stream contains literal bytes that look like control characters, which breaks decoders that interpret them and confuses ones that do not.
  2. Mixing subsets when you do not need to. If your data is all uppercase plus digits, Set B handles it. There is no benefit to forcing Set A unless you actually have ASCII control characters below space.
  3. Omitting the check digit. Code 128 has no error correction; the check digit is the only safety net. A symbol without it will not decode reliably and is non-conformant.
  4. Trying to use Set C with an odd-length digit run. Set C encodes pairs. The encoder handles odd runs by encoding the last digit in A or B; do not try to pad with a leading or trailing zero unless that zero is part of your real data.
  5. Using FNC4 inside Set C, or inside a GS1-128. FNC4 is not legal in Set C, and it is forbidden in GS1-128 by GS1 §5.3.1.1. The encoder will reject this combination.
  6. Hand-calculating the check digit and forgetting that the start character contributes with weight 1. The start character is included in the weighted sum. Skipping it shifts the result by the start value modulo 103: 0 for START A (103 mod 103 = 0), 1 for START B, and 2 for START C. The error is therefore invisible in a Set A test and silently wrong for symbols that start in Set B or Set C.
  7. Quiet zones too narrow. Code 128 needs at least ten modules of quiet zone on each side. Quiet-zone violations are the single most common cause of "the barcode looks fine but does not scan". For diagnosis, see why won't my barcode scan.

References


Last verified against ISO/IEC 15417:2007 and GS1 General Specifications, Release 24.0 (Jan 2026), on 2026-05-01.