I couldn't figure out how to do this using Neonnaut’s gloss generator, so I wrote a simple Python script for glossing multi-line texts. Each entry in the text must have four lines, in this order: original, romanization, gloss and translation (anything after the fourth line is ignored). Entries are separated by blank lines. I'm sure you can figure out which lines to comment out if you don't need them all.
Code:
#!/usr/bin/env python3
import sys
import io

# must have UTF-8
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')


def batch_glosses(text):
    lines = text.split('\n')
    glosses = []
    current_gloss = []
    # Split the input into entries separated by blank lines
    for line in lines:
        if line.strip() == '':
            if len(current_gloss) > 0:
                glosses.append(current_gloss)
                current_gloss = []
        else:
            current_gloss.append(line)
    # Don't lose the last entry if the file doesn't end with a blank line
    if len(current_gloss) > 0:
        glosses.append(current_gloss)

    # Loop through each gloss
    results = []
    for gloss in glosses:
        if len(gloss) < 4:
            results.append('[color=red]Each gloss needs 4 lines[/color]\n')
            continue
        original = gloss[0]
        romanization = gloss[1]
        gloss_line = gloss[2]
        translation = gloss[3]
        # Split romanization and gloss into words
        rom_words = romanization.split()
        gloss_words = gloss_line.split()
        # Wrap each romanized word in a gloss tag;
        # words without a matching gloss get an empty one
        tagged_rom_parts = []
        for i, rom_word in enumerate(rom_words):
            gloss_word = gloss_words[i] if i < len(gloss_words) else ''
            tagged_rom_parts.append(f'[gloss={gloss_word}]{rom_word}[/gloss]')
        tagged_rom = ' '.join(tagged_rom_parts)
        # Put the translation in quotes unless it already starts with one
        quoted = translation if translation.startswith('"') else f'"{translation}"'
        results.append(f'{original}\n{tagged_rom}\n{quoted}')
    return '\n\n'.join(results)


def main():
    if len(sys.argv) < 2:
        print('Specify the input file', file=sys.stderr)
        sys.exit(1)
    file = sys.argv[1]
    try:
        with open(file, 'r', encoding='utf-8') as f:
            text = f.read()
        output = batch_glosses(text)
        print(output)
    except FileNotFoundError:
        print('Input file not found', file=sys.stderr)
        sys.exit(1)
    except Exception as e:
        print(f'Error reading file: {e}', file=sys.stderr)
        sys.exit(1)


if __name__ == '__main__':
    main()
Sample input:
রাস্তায় গাড়িঘোড়ার বিরাম নাই,
rasta-i gaɽi-ɡʱoɽa-r biram nai
road.loc car-horse.gen rest is.none
On the road, the cars and horses have no rest

ফেরিওয়ালা অবিশ্রাম হাঁকিয়া চলিয়াছে,
pʰeriwala ɔbisram hãkija tʃolijatʃʰe
hawker no-rest call.part walk.3.perf.cont
the street hawkers keep walking, calling without rest,

যাহারা আপিসে কালেজে আদালতে যাইবে
dʒahara ɔpiʃ-e kɔledʒ-e ad̪alɔt-e dʒaib-e
those office.loc college.loc court.loc go.3.fut
those who will go to offices, colleges, law courts,
Corresponding output:
Code:
রাস্তায় গাড়িঘোড়ার বিরাম নাই,
[gloss=road.loc]rasta-i[/gloss] [gloss=car-horse.gen]gaɽi-ɡʱoɽa-r[/gloss] [gloss=rest]biram[/gloss] [gloss=is.none]nai[/gloss]
"On the road, the cars and horses have no rest"

ফেরিওয়ালা অবিশ্রাম হাঁকিয়া চলিয়াছে,
[gloss=hawker]pʰeriwala[/gloss] [gloss=no-rest]ɔbisram[/gloss] [gloss=call.part]hãkija[/gloss] [gloss=walk.3.perf.cont]tʃolijatʃʰe[/gloss]
"the street hawkers keep walking, calling without rest,"

যাহারা আপিসে কালেজে আদালতে যাইবে
[gloss=those]dʒahara[/gloss] [gloss=office.loc]ɔpiʃ-e[/gloss] [gloss=college.loc]kɔledʒ-e[/gloss] [gloss=court.loc]ad̪alɔt-e[/gloss] [gloss=go.3.fut]dʒaib-e[/gloss]
"those who will go to offices, colleges, law courts,"

Rendered:
রাস্তায় গাড়িঘোড়ার বিরাম নাই,
rasta-i   gaɽi-ɡʱoɽa-r   biram  nai
road.loc  car-horse.gen  rest   is.none
"On the road, the cars and horses have no rest"

ফেরিওয়ালা অবিশ্রাম হাঁকিয়া চলিয়াছে,
pʰeriwala  ɔbisram  hãkija     tʃolijatʃʰe
hawker     no-rest  call.part  walk.3.perf.cont
"the street hawkers keep walking, calling without rest,"

যাহারা আপিসে কালেজে আদালতে যাইবে
dʒahara  ɔpiʃ-e      kɔledʒ-e     ad̪alɔt-e   dʒaib-e
those    office.loc  college.loc  court.loc  go.3.fut
"those who will go to offices, colleges, law courts,"
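One thing to watch for: the script pairs romanized words with gloss words purely by position. If the romanization line has more words than the gloss line, the extra words get an empty gloss tag, and surplus gloss words are silently dropped. Here is a rough sketch of a pre-check you could run on the input first. It assumes the same blank-line-separated, four-line layout as above; `find_misaligned` is a name I made up for illustration, not part of the script, and the second entry in the sample is deliberately missing its last gloss word.

```python
def find_misaligned(text):
    """Return (entry_number, rom_count, gloss_count) for each mismatched entry.

    Hypothetical helper, not part of the script above. Assumes entries are
    separated by blank lines and ordered original / romanization / gloss /
    translation.
    """
    problems = []
    entry = []
    entry_number = 0
    # The extra empty string at the end flushes the last entry.
    for line in text.split('\n') + ['']:
        if line.strip():
            entry.append(line)
            continue
        if not entry:
            continue
        entry_number += 1
        # Entries with fewer than 4 lines are skipped here; the main
        # script already flags those in red.
        if len(entry) >= 4:
            rom_count = len(entry[1].split())
            gloss_count = len(entry[2].split())
            if rom_count != gloss_count:
                problems.append((entry_number, rom_count, gloss_count))
        entry = []
    return problems


# Second entry is deliberately broken: 5 romanized words, 4 gloss words.
sample = """রাস্তায় গাড়িঘোড়ার বিরাম নাই,
rasta-i gaɽi-ɡʱoɽa-r biram nai
road.loc car-horse.gen rest is.none
On the road, the cars and horses have no rest

যাহারা আপিসে কালেজে আদালতে যাইবে
dʒahara ɔpiʃ-e kɔledʒ-e ad̪alɔt-e dʒaib-e
those office.loc college.loc court.loc
those who will go to offices, colleges, law courts,
"""

for number, rom_count, gloss_count in find_misaligned(sample):
    print(f'Entry {number}: {rom_count} romanized words, {gloss_count} gloss words')
```

Entries are numbered from 1 in the order they appear in the file, so a reported number tells you which block to fix before running the glosser.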
---
If you use this tool, consider contributing to any of my conlang or natlang threads. Thank you.
Edit: I told a custom SLM to convert this script into a webpage: https://drive.google.com/file/d/1Yr2QnO ... sp=sharing
I gave the SLM some inputs to test the script with, so the result should be correct. Download the file and open it in your browser. (The SLM runs on my local machine, not in a data center. If you are opposed to AIs in any capacity whatsoever, regardless of environmental impact, then stick to the Python script above.)