Hello, I'm ScoopGracie.
I heard about Google Code-in by reading the Google Open Source blog. I thought it sounded cool, so I decided to participate this year.
When I looked through the orgs list, nothing really caught my attention at first. OpenMRS looked interesting, but I don't know Java. I looked for orgs that use Python, which is my favorite language. The first one I found was Apertium, so I picked that.
Apertium is a free and open-source rule-based machine translation (RBMT) system. It mostly focuses on minority languages, which often lack any other machine translation systems.
The Apertium community on IRC is very welcoming and helpful. They always clarified anything I didn't understand and answered any questions I had. I ended up doing nine tasks, which is a fairly typical number for Apertium, although one of them was not done correctly, and one of them was actually supposed to be split into two. That leaves it at nine.
Because I only know English, I wasn't able to do a whole lot on the linguistic data. I don't know much C++, so I couldn't do much on the core translator, either. I ended up doing these tasks:
- This task was to clean up and update a page on the Apertium Wiki. I did the article on Begiak (Apertium's IRC bot).
- This task was to scrape a website assigned by a mentor. Note: you don't need to know the language(s) the site is written in to do this!
- This was to add a new command, .whois, to Begiak. It shows some basic info about a user. Because I did both parts 1 and 2 at once, this was counted as two tasks.
- Begiak frequently crashes. For this task, I needed to suggest an error reporting system. Due to a dependency on the Flask framework, it was never merged, but it worked, so the task was accepted.
- Every time a user connects to IRC through Matrix, it adds [m] to the end of his/her nick. Since this is often unwanted, Begiak informs users of this. However, the message is annoying to those who want to keep the [m], so I made it optional in this task.
- This task was to take a list of 500 words in English and tag their parts of speech.
- The init script for Begiak was broken; this task was to fix it.
- This was to make Begiak accept commands case-insensitively. After this was approved, it was decided to not make commands case-insensitive.
Overall, I had an excellent experience working with Apertium and would highly recommend the organization to any future GCI students.