This would be a perfect fit for an LLM (albeit a bit complicated by byte-pair tokenization and decoder-only models rather than bidirectional ones). You can clamp the grille words, and just generate samples with those as requirements until they reach a good likelihood and are highly fluent and indistinguishable from non-message-carrying text.
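Concretely, the loop I have in mind is something like the following Python sketch; `generate_with_clamped_words` and `avg_logprob` are hypothetical stand-ins for whatever constrained-decoding and scoring machinery you'd actually wire up to the model:

```python
# Hypothetical helpers (not a real library API): assume some constrained-decoding
# routine that forces the grille words to appear at the given character offsets,
# and a scorer returning the model's average per-token log-probability of a text.
def generate_with_clamped_words(secret_words, positions):
    raise NotImplementedError  # placeholder for constrained decoding

def avg_logprob(text):
    raise NotImplementedError  # placeholder for model scoring

def sample_cover_text(secret_words, positions,
                      min_avg_logprob=-3.0, max_tries=1000):
    """Rejection-sample cover texts until one is fluent enough to pass."""
    best = None
    for _ in range(max_tries):
        candidate = generate_with_clamped_words(secret_words, positions)
        score = avg_logprob(candidate)
        if best is None or score > best[0]:
            best = (score, candidate)
        if score >= min_avg_logprob:
            return candidate
    return best[1]  # fall back to the most fluent candidate seen
```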
That reminds me of this suckerpinch/tom7 video: https://www.youtube.com/watch?v=Y65FRxE7uMc
> (albeit a bit complicated by byte-pair tokenization and decoder models rather than bidirectional)
You can clamp tokens (instead of letters / words) in the grille, I guess?
No, it's the other tokens that are the problem, the ones you're trying to find. You need to maintain a fixed width of characters. You know what characters the secret words tokenize from, so there's no problem there (you just record the number of characters, turn them into tokens, and you're fine), but all of the possible chaff text can come out with the wrong character counts. You sample a candidate text of tokens, the likelihood is good, it reads fluently, and the secret words don't stick out, but then the first line has 15 characters too many, the second has 5 too few, and so on, simply because of the vagaries of BPEs.
(And you can't replace the grille positions with relative indexes like 'the ith, mth, and nth tokens' because the tokenizations will all change... Maybe a more complicated relative index like 'whitespace-separated character ranges'... Well, whatever you do, it'll be annoying when it's a BPE LLM instead of a character-level model.)
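To make the width problem concrete, here's a toy check in Python; the 40-character grid width is just an assumed example, but it shows why most otherwise-fluent samples get thrown away:

```python
# Toy illustration of the width problem: BPE tokens decode to strings of
# varying length, so lines assembled from sampled tokens drift off the grid.
GRID_WIDTH = 40  # assumed fixed line width of the grille (example value)

def line_width_errors(cover_text, width=GRID_WIDTH):
    """Per line, how many characters the line is over (+) or under (-)."""
    return [len(line) - width for line in cover_text.splitlines()]

def fits_grid(cover_text, width=GRID_WIDTH):
    """A candidate is only usable if every line hits the width exactly."""
    return all(err == 0 for err in line_width_errors(cover_text, width))

sample = (
    "The quiet market reopened after the long winter, and\n"
    "traders argued about prices nobody could remember."
)
print(line_width_errors(sample))  # [12, 10]: both lines overshoot the grid
print(fits_grid(sample))          # False: this sample would be rejected
```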
Even more impressive is reading about the life of Cardano himself.
“Cars…
Cars on…
Carson City.” -John Cusack in Con Air