What is happening
Why the Bug Happens
The bug occurs in the tokenizer routine (starting at 1BC0H), which converts the input ASCII line into tokenized form by matching against the keyword table (starting at 164FH). This table includes entries for statements, operators, and special functions, where each keyword string ends with its last character ORed with 80H (making it negative in signed byte terms). For TAB(, the table entry is the bytes for "TAB(" with the final '(' | 80H (28H + 80H = A8H), corresponding to token A1H.
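The table encoding described above can be sketched in a few lines of Python. This is only an illustration of the stated convention (final character ORed with 80H); the helper names `encode_keyword` and `is_last_char` are hypothetical and do not come from the ROM.

```python
def encode_keyword(word: str) -> bytes:
    """Encode a keyword as described in the text: plain ASCII,
    with bit 7 set on the final character only."""
    raw = word.encode("ascii")
    return raw[:-1] + bytes([raw[-1] | 0x80])


def is_last_char(table_byte: int) -> bool:
    """A table byte with bit 7 set is 'negative' as a signed byte."""
    return table_byte >= 0x80


entry = encode_keyword("TAB(")
print(entry.hex(" ").upper())                # 54 41 42 A8
print([is_last_char(b) for b in entry])      # [False, False, False, True]
```

The last byte, A8H, is 28H ('(') with bit 7 set, which is the value the walkthrough refers to.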
- The tokenizer scans the input (HL pointing to current input char) and attempts to match keywords char-by-char against the table (DE pointing to table).
- It starts by uppercasing input letters (at 1C01–1C0B if lowercase) and skipping to the start of each keyword in the table (loop at 1C0E–1C13).
- For each potential keyword, it compares the first char (table & 7FH vs. input at 1C18 B9 CP C).
- If match, it advances table (INC DE at 1C1D) and loads next table char (1C1E 1A LD A,(DE)).
- If the loaded char is positive (OR A at 1C1F, no M flag), it sets C = A (1C23), optionally skips spaces for specific tokens like 8DH (at 1C25–1C2A), advances input (INC HL at 1C2B), loads the next input char and uppercases it (1C2C–1C32), then compares (1C33 B9 CP C).
- If match, loop back (1C34 JR Z,1C1D).
- Crucially, if the next table char is negative (the last char, like A8H for '('), it jumps to match success at 1C39 (via JP M at 1C20) without advancing input (no INC HL) and without comparing the input char to (table char & 7FH).
- This means the last character in keywords like TAB( (the '(') is neither checked against the input nor consumed by the tokenizer. The parser later expects the ( to follow token A1H, so the match only works out when the input happens to line up.
- When a space is present ("TAB (63)"), 'T', 'A', and 'B' match as positive table chars, and the next table byte is the final A8H. The JP M at 1C20 is then taken without comparing the space against 28H and without a further INC HL, so the space is never examined as part of the keyword.
- No spaces are skipped between keyword characters, except for token 8DH (1C29 RST 10H, which calls the space skipper at 1D78H). The matching logic therefore requires the keyword characters to be consecutive in the input, and the space in "TAB (" breaks the match.
- If no keyword match (after all table tries, RET Z at 1C17 pops to 1C3D), it falls back to treating the sequence as a variable name (copying alphanum chars at 1C3D onward).
- Thus, with space, no match for TAB(, so "TAB" becomes variable TAB, spaces skipped (via RST 10H elsewhere), then '(' starts array subscript parsing (handled in expression evaluator at 2337H).
- Without a space, the tokenizer matches TAB( (it checks 'T', 'A', 'B' and skips the check for '('), outputs A1H, and leaves HL at '('; the parser then handles it as the start of the function argument.
This violates the "spaces have no significance" rule because the tokenizer's matching requires strict consecutivity for special chars like '('.
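The observable behaviour can be reproduced with a small Python model. This is a deliberately simplified sketch, not a transcription of the ROM loop: it assumes every keyword character, including the '(', must appear consecutively in the input, and the function name `match_keyword_strict` is hypothetical.

```python
def match_keyword_strict(line: str, pos: int, keyword: str):
    """Return the index just past the keyword if it matches at pos,
    otherwise None. Input letters are uppercased; spaces are never skipped."""
    for offset, keyword_char in enumerate(keyword):
        i = pos + offset
        if i >= len(line) or line[i].upper() != keyword_char:
            return None
    return pos + len(keyword)


print(match_keyword_strict("TAB(63)", 0, "TAB("))   # 4
print(match_keyword_strict("tab(63)", 0, "TAB("))   # 4
print(match_keyword_strict("TAB (63)", 0, "TAB("))  # None
```

In the failing case the text would then fall through to the variable-name path, which is how "TAB" ends up being read as an array named TAB.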
How to Fix It
To fix it, modify the tokenizer to skip spaces before checking non-letter keyword characters (like the '(' in TAB(), without allowing spaces inside regular keywords.
One option is to patch just after the next table char is loaded (around 1C1E): if the char is not a letter (tested with FE41 CP 'A' for below 'A' and FE5B CP '[' for above 'Z'), execute RST 10H to skip spaces before the CP.
A first idea is to patch at 1C23 (LD C,A), replacing it with:
E67F AND 7FH ; C = char & 7FH always (for the last char too)
4F LD C,A
The sign of the original A (negative for the last char) would then have to be tracked separately.
A simpler way to cover the last char is to redirect the JP M at 1C20 to a new patch area. The patch skips spaces, loads and uppercases the input char, compares it with C (the last char & 7FH), branches to the mismatch path if they differ, and otherwise consumes the char and continues at 1C39.
Since the ROM is read-only, use a RAM patch: redirect a vector or run from a RAM overlay.
For example, change the JP M,1C39 at 1C20 to JP M,new_addr, with new_addr in RAM.
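At new_addr:
E67F AND 7FH ; strip bit 7 from the last keyword char
4F LD C,A ; C = last char & 7FH
D7 RST 10H ; skip spaces, A = next non-space input char
B9 CP C ; compare input to last char
C2361C JP NZ,1C36H ; mismatch
C3391C JP 1C39H ; match
No DEC HL / INC HL adjustment is needed around the RST 10H, since RST 10H already loads A and advances past the spaces.
This way, when the table char is negative (the last char), the patch skips spaces, fetches the next input char, and compares it with the last char & 7FH; if they are equal, the keyword matches.
The same path is taken for any other keyword whose final table byte is negative.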
This fixes the bug by skipping spaces before the '(', so that "TAB (" matches as TAB( and the input advances past the (.
With this, spaces after the keyword letters but before the ( are skipped, and the input is treated as the function.
The bug occurs in the tokenizer routine (starting at address 1BC0H), which converts the input ASCII line into tokenized form by matching against the keyword table (starting at address 164FH). This table includes entries for statements, operators, and special functions, where each keyword string ends with its last character ORed with 80H (making it negative in signed byte terms). For TAB(, the table entry is the bytes for "TAB(" with the final '(' | 80H (28H + 80H = A8H), corresponding to token A1H.
- The tokenizer scans the input (HL pointing to current input char) and attempts to match keywords char-by-char against the table (DE pointing to table).
- It starts by uppercasing input letters (at 1C01–1C0B if lowercase) and skipping to the start of each keyword in the table (loop at 1C0E–1C13).
- For each potential keyword, it compares the first char (table & 7FH vs. input at 1C18 B9 CP C).
- If match, it advances table (INC DE at 1C1D) and loads next table char (1C1E 1A LD A,(DE)).
- If the loaded char is positive (OR A at 1C1F, no M flag), it sets C = A (1C23), optionally skips spaces for specific tokens like 8DH (at 1C25–1C2A), advances input (INC HL at 1C2B), loads the next input char and uppercases it (1C2C–1C32), then compares (1C33 B9 CP C).
- If match, loop back (1C34 JR Z,1C1D).
- If the next table char is negative (the last char, like A8H for '('), it jumps to match success at 1C39 (via JP M at 1C20) without advancing input (no INC HL at 1C2B) and without comparing the input char to (table char & 7FH) at 1C33.
- This means the last character in keywords like TAB( (the '(') is not checked against the input, and input is not advanced past it. The parser (expression evaluator at 2337H) later expects the argument after token A1H, so matching relies on the keyword characters being consecutive in the input.
- When a space is present ("TAB (63)"), 'T', 'A', and 'B' match as positive table chars. The next table byte is negative, so the JP M at 1C20 skips the INC HL at 1C2B and the compare at 1C33, and the space is never compared against '('. The table still expects consecutive characters, so the space breaks the match for the full "TAB(" keyword.
- The key is no general space skip between chars (only for token 8DH at 1C29 D7 RST 10H, calling skipper at 1D78H).
- If no keyword matches, RET Z at 1C17 returns to 1C3D, which treats the text as a variable name (copying alphanumeric chars from 1C3D onward); a following ( is then parsed as a subscript by the expression evaluator at 2337H.
- With a space, there is no match for TAB(: "TAB" is treated as the variable TAB, the space is skipped (RST 10H), and (63) is parsed as a subscript.
- Without a space, the tokenizer matches "TAB(", outputs A1H, advances past '(', and 63 is parsed as the argument.
This violates the rule that spaces have no significance except in printed messages, as the space prevents matching the full "TAB(" keyword.
How to Fix It
To fix it, modify the tokenizer to skip spaces before comparing the last keyword char (negative A at 1C20), so that '(' can be preceded by spaces without failing the match.
The specific address to patch is the JP M at 1C20: FA391C JP M,1C39H.
Change it to JP M,patch_addr. Since the ROM is read-only, this means hooking from RAM: use a loader to copy the ROM to RAM, or patch via a vector at 4015H or similar.
At patch_addr (assuming free RAM):
E67F AND 7FH ; strip bit 7 from the last keyword char
4F LD C,A ; C = last char & 7FH
D7 RST 10H ; skip spaces, A = next non-space input char, HL advanced
B9 CP C ; compare input to last char
C2361C JP NZ,1C36H ; mismatch path at 1C36 (E1 POP HL; 18D3 JR 1C0C); absolute JP, since a JR only reaches -128..+127 bytes
C3391C JP 1C39H ; match
This adds a check and a space skip for a negative (last) table char: the patch skips spaces, compares the next non-space input char to the last char & 7FH, and on a match proceeds to 1C39.
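Any relative jump in the patch is limited to a signed 8-bit displacement, so whether a JR can reach the mismatch path at 1C36H depends on where the patch is placed. The sketch below computes the displacement byte in Python; the helper name `jr_displacement` and the patch address 7F00H are hypothetical, chosen only for illustration. It reproduces the D3 byte of the existing JR 1C0C at 1C37 and checks reachability from the assumed RAM address.

```python
def jr_displacement(jr_addr: int, target: int) -> int:
    """Displacement byte for a Z80 JR located at jr_addr that jumps to target.
    The displacement is relative to the address after the 2-byte instruction."""
    disp = target - (jr_addr + 2)
    if not -128 <= disp <= 127:
        raise ValueError(f"JR at {jr_addr:04X}H cannot reach {target:04X}H")
    return disp & 0xFF


# Existing ROM instruction quoted in the listing: 1C37 18 D3 JR 1C0C
print(f"{jr_displacement(0x1C37, 0x1C0C):02X}")  # D3

PATCH_ADDR = 0x7F00          # hypothetical free RAM address
JR_ADDR = PATCH_ADDR + 5     # after AND 7FH, LD C,A, RST 10H, CP C
try:
    jr_displacement(JR_ADDR, 0x1C36)
except ValueError as err:
    print(err)               # JR at 7F05H cannot reach 1C36H
```

When the target is out of range like this, an absolute JP NZ,1C36H (C2 36 1C) does the same job at the cost of one extra byte.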
With this, for "TAB (": after 'B', the tokenizer loads A = A8H and takes JP M to the patch. AND 7FH gives 28H, so C = 28H. RST 10H skips the space and returns A = '(' with HL at the '('. CP C matches, and JP 1C39H completes the match.
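As written, the patch runs for the last character of every keyword, since every keyword's final table byte is negative; it does not allow spaces between the earlier characters. Restricting the skip to non-letter final characters such as '(' would need an extra test on C (for example CP 'A' and CP '[').
SPC(, a similar token, is fixed as well.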
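The intended effect of the patch can be sketched as a Python model: all characters except the last must be consecutive, and spaces are skipped only before the final keyword character. This is a behavioural illustration under that assumption, not a simulation of the Z80 code; `match_keyword_patched` is a hypothetical name.

```python
def match_keyword_patched(line: str, pos: int, keyword: str):
    """Return the index just past the keyword if it matches at pos, else None.
    Spaces are skipped only immediately before the final keyword character."""
    i = pos
    for keyword_char in keyword[:-1]:
        if i >= len(line) or line[i].upper() != keyword_char:
            return None
        i += 1
    # Model of the RST 10H step in the patch: skip spaces before the last char.
    while i < len(line) and line[i] == " ":
        i += 1
    if i >= len(line) or line[i].upper() != keyword[-1]:
        return None
    return i + 1


print(match_keyword_patched("TAB(63)", 0, "TAB("))    # 4
print(match_keyword_patched("TAB (63)", 0, "TAB("))   # 5
print(match_keyword_patched("SPC  (5)", 0, "SPC("))   # 6
print(match_keyword_patched("T AB(63)", 0, "TAB("))   # None
```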