Pinyin Web & EPUB and related apps
I maintained these Android applications under the pseudonym "Referenced Expressions" between 2014 and 2024, until a Google Play policy change enforced the use of real names for individual developers going into 2025.
My reasons for using a pseudonym had been twofold. Having used material from a certain religious website in developing and testing the app, I felt I should link that website prominently from the app's start page. But I didn't want to give anyone the impression that I was trying to take credit for that site myself, and I had an ill relative who might have been vulnerable to a heart attack if he found out at the wrong moment that I was linking to a religious site. For both of these reasons I wanted to avoid putting my name on the app.
In my case the second reason is no longer present (hence I could let my name be used on a paper I helped write about their digitalisation in the pandemic), and the first reason is rendered moot by the new policy (I didn't choose to put my name there, I had to). Also, the number of that site's readers using my app dropped off when the site implemented more browsable pinyin of its own, so the issue had become quantitatively smaller. I therefore felt I could now "unmask" this project. I don't know if any other developer had to delete their work rather than be unmasked.
Google Play Store listings
Since I saw an elderly lady still using the app on an eight-year-old Galaxy S2 phone in 2019, I decided to minimise the size of my apps by keeping the different topolects in different apps and by removing some of the more obscure words from the dataset for the Google Play versions (I'm assuming for example that users outside of China will not require so many mainland place names). The following are available on Play Store:
- Pinyin Web & EPUB (generally good-quality pinyin as far as automatic pinyin goes; should recognise when a site is providing its own pinyin and defer to it; paragraph audio supported on some devices)
- Cantonese Web & EPUB (generally good-quality Cantonese as far as automatic annotation goes; choice of Yale, Sidney Lau or Jyutping; isolated word audio via Gradint server + paragraph audio supported on some devices)
- Teochew Web & EPUB (I'm told the Teochew annotation is OK although I wasn't completely sure of my data sources)
- Wenzhou Web & EPUB (beware this one is not as good at choosing the correct reading for ambiguous characters in as many cases as the above apps)
- Fuqing Web & EPUB (highly experimental annotator which I'm told is not very good and my test users have not yet been able to provide specific details for improvement)
- Thai Romanising Browser (this one did not work out so well, it's not maintained and I should probably remove it)
Side-loading links
If you are not able to use the Play Store, you may download the same APK files from here, but you'll need to enable your phone's "Unknown sources" setting and it won't update automatically.
- Pinyin Web & EPUB side-loading APK
- Cantonese Web & EPUB side-loading APK
- Teochew Web & EPUB side-loading APK
- Wenzhou Web & EPUB side-loading APK
- Fuqing Web & EPUB side-loading APK
Huawei AppGallery version
In 2019 US legislation forced Google to ban Huawei from using the Play Store, and consequently there was a period when Huawei sold Android phones in the UK without the Play Store. Some of these ended up with low-end customers of H3G (Three), at least one of whom wanted to install my app without side-loading, so as a service to these and to other Huawei users I applied to put my app on the Huawei AppGallery as well.
This version merges multiple annotators, since I assumed app size was not an issue on the 2019 phones and I really didn't want to go through the process of setting up multiple different apps again. It also includes the mainland place names I withheld from the Google Play version for app size reasons.
I wanted to give it a different variation of the app name in case anyone had both installed on the same device, but the checkers didn't like me calling it "Huawei edition" or "AppGallery edition", and they weren't even sure whether "merged edition" might be a promotional suffix, so in the end we went for "Pinyin Web & EPUB and dialects" (I wanted to say "topolects" but that would have taken me over the 30-character limit).
This version also lacks the link to the religious site which is blocked in China. Nevertheless by the end of 2022 regulations had been tightened and Huawei was obliged to say all apps require an ICP license to be allowed in China, whether they link to religious sites or not, and ICP licenses are not available to people without Chinese citizenship (unless you want to pay the price of a house for some company to do it for you, or burden a Chinese friend with the risk of legally vouching for your code), so Huawei made my app available in every country except China. The AppGallery will now pretend it doesn't exist if you try to search for it inside China, and I think people there who got it before 2023 are not receiving updates (and Huawei regulations didn't allow my own update checks, so this version doesn't prompt you to update if it's more than a year old as the Google Play version does).
Here is the AppGallery link for all countries except China.
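As a minimal sketch of the kind of age-based update reminder mentioned above (the one the Google Play version has and this version lacks), the following assumes the check simply compares a build date baked into the app against the device clock, with no network access. It is not the app's actual code: the function name, the 365-day threshold and the dates are all hypothetical.

```python
from datetime import date, timedelta

# Assumption: "more than a year old" is taken to mean more than 365 days.
MAX_AGE = timedelta(days=365)

def should_prompt_update(build_date, today=None):
    """Return True if the build is old enough to suggest checking for an update.

    build_date: the date the app was built (hypothetically baked in at compile time).
    today: the current date; defaults to the device clock.
    """
    if today is None:
        today = date.today()
    return today - build_date > MAX_AGE

# Example: a build from January 2022 checked in June 2023 is over a year old.
print(should_prompt_update(date(2022, 1, 1), date(2023, 6, 1)))
```

A check like this relies only on the local clock, so it cannot tell whether a newer version really exists; it can only suggest that the user look.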
Browser extensions
These are switchable between Pinyin, Cantonese, Teochew and Wenzhou.
- Mozilla Firefox "Pinyin Web" add-on (desktop or mobile)
- Google Chrome "Pinyin Web" extension (desktop)
(yes, this was the one I deliberately broke at the Oxford China Forum to show "AI" has limits)
If you're stuck on an old Windows PC with no Internet:
- Download Pinyin-Clipboard or Cantonese-Clipboard, transfer to the PC via removable media or whatever and unpack
- Put the text you want on the clipboard and run. No "installation" necessary, just run the EXEs.
- On Windows 7+, click the small "More options" link to reveal the "Run anyway" option. You should have to do this only once. Sorry I haven't paid Microsoft to be a "known publisher" to make this warning go away.
- Please update manually from time to time as there is no auto update with these.
iOS
I have not been able to port my main app to iOS or iPadOS because it is fundamentally an extended Web browser, and Apple has strict policies about third-party Web browsers, so I'm not confident I'll pass their test and I don't think it's worth paying to try. (I was already rejected by Amazon on the grounds that they didn't want any Web browser in their store other than their Silk browser, so I really didn't think it was worth the hassle and expense of trying for iOS.)
However, other iOS developers have managed to incorporate my code and/or data into iOS apps that are not Web browsers. But some of them had to charge money to cover Apple's ongoing developer fees and the costs of keeping their Mac hardware sufficiently current for development (it's far more expensive to be an Apple developer than to be a Google Play developer, so you won't find so many "free with no ads" hobbyist public service apps on Apple). Anyway their apps include:
- Matthew Delmarter's Equipd Bible and ServicePlanner apps (both paid) use a version of my annotator code and data for Mandarin, Cantonese and Japanese,
- Jon Hargett's 3lines.org app (gratis) uses some of my data for Mandarin and Cantonese, although it's not always up-to-date (this can sometimes be mitigated by manually refreshing the dictionary), and it's not able to use my code but I was able to generate weightings to kludge its own code into giving a nearly-right result,
- Michael Buen's "Chinese Words Separator" extension (paid on Safari, gratis on Chrome) now at least includes words from CedPane, and Pleco with paid Flashcards component can also import CedPane and use it in its Reader, which is at least something even if it can't run the rest of my code.
Online version
I'm no longer running a full domain-rewriting proxy with this functionality (at least not on my public server), after some legal trouble in 2013 when someone mistook it for a VPN in a Russian religious extremism trial, but I still have a CGI that lets you paste in your own text and it can set up bookmarklets to annotate pages if you're adventurous. Since this requires sending your text to my server, I suggest you use it only as a last resort.
Source material
My code to compile annotators from examples is Annotator Generator which I've liberally licensed and also written up in an Overload paper, but the corpus I use is not in a publishable state. It is made up of:
- the 1990/91 PH Corpus, with some manual corrections by me (which I'm not sure I can republish),
- my CedPane project (both the main file and the auxiliary gloss file), plus some of the unpublished entries that I'm insufficiently sure qualify for CedPane but think they're still acceptable for my own app,
- some extra gloss data from a friend's text-annotation project I had permission to use in my own apps,
- fallback single-character readings from the Unihan database with some manual corrections by me to get the most preferred reading,
- a bunch of extra example sentences I've put in as and when I've found it needs more help to get a particular case right,
- and quite a bit of Chinese-with-Pinyin text from JW publications with some normalisations and a few dozen (rare) typos corrected by myself (I relied on a UK legal exemption that permits responsible private download for non-commercial text mining, but not republication).
I don't mean to offend anyone but LDS websites were not able to provide me with good-quality Chinese-with-Pinyin publications in a format I could parse with correct word grouping, whereas JW.org could. It might make Mormons feel better to know that I did use an LDS Quad for Japanese data when making the Japanese annotator for Matthew Delmarter's apps. I just couldn't find decent Chinese data from that source, sorry.
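To illustrate how word-level data and fallback single-character readings like those listed above might combine, here is a toy sketch. It is not how Annotator Generator works (that compiles context-sensitive annotators from examples); it swaps in a plain longest-match dictionary lookup, and the tables and the `annotate` function are hypothetical stand-ins for a far larger dataset.

```python
# Hypothetical toy data; a real corpus is far larger and context-sensitive.
WORD_READINGS = {
    "中国": "Zhōngguó",
    "银行": "yínháng",
}
# Hypothetical fallback table: one preferred reading per character.
CHAR_FALLBACK = {
    "中": "zhōng",
    "国": "guó",
    "银": "yín",
    "行": "xíng",
    "人": "rén",
}
MAX_WORD_LEN = max(len(word) for word in WORD_READINGS)

def annotate(text):
    """Return (chunk, reading) pairs: longest word match first, then per-character fallback."""
    result = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to a single character.
        for length in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            chunk = text[i:i + length]
            if chunk in WORD_READINGS:
                result.append((chunk, WORD_READINGS[chunk]))
                i += length
                break
        else:
            # No word matched: use the single-character reading (None if unknown).
            ch = text[i]
            result.append((ch, CHAR_FALLBACK.get(ch)))
            i += 1
    return result

print(annotate("中国人"))
print(annotate("银行"))
```

The second call shows why word-level data matters: 行 on its own falls back to xíng, but inside 银行 the word entry gives háng.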
Copyright and Trademarks
All material © Silas S. Brown unless otherwise stated.
Android is a trademark of Google LLC.
Apple is a trademark of Apple Inc.
Firefox is a registered trademark of The Mozilla Foundation.
Google is a trademark of Google LLC.
Google Play is a trademark of Google LLC.
H3G is a trademark of Hutchison Whampoa Enterprises Limited.
Huawei is a trademark of Huawei Technologies Co., Ltd registered in China and other countries.
JW.org is a trademark of Watch Tower Bible and Tract Society of Pennsylvania.
LDS is possibly registered as a trademark in some countries by Intellectual Reserve Inc (owned by the Corporation of the President of The Church of Jesus Christ of Latter-day Saints) but I was unable to find which countries.
Mac is a trademark of Apple Inc.
Microsoft is a registered trademark of Microsoft Corp.
Mormon is registered as a trademark in Europe held by Intellectual Reserve Inc. which is owned by the Corporation of the President of The Church of Jesus Christ of Latter-day Saints.
Mozilla is a registered trademark of The Mozilla Foundation.
Safari is a registered trademark of Apple Inc.
Windows is a registered trademark of Microsoft Corp.
Any other trademarks I mentioned without realising are trademarks of their respective holders.