2020/08/08

Gitlab + letsencrypt + Private network

Requirements

This post assumes you have the following:
  • A Gitlab server
  • A TLS certificate from let’s encrypt
  • A private network, which the outside world cannot reach directly
  • A public domain name that you own
If your gitlab server is reachable from the outside world, you can follow the manual here to set up let’s encrypt.
For poor souls like me, stay tuned and keep reading.

What is let’s encrypt

For details, please read their website. To me, it is a service that gives me a valid TLS certificate free of charge.

Gitlab server + letsencrypt

According to my experiments, up to Gitlab 12.9.2, Gitlab does not support requesting a certificate via the DNS challenge. So, if your gitlab server is not public at all, you can simply disable the let’s encrypt configuration provided by gitlab.
Surprisingly, we can still benefit from let’s encrypt, since it will still issue a TLS certificate and private key, even though you need to repeat the verification every 3 months.
In other words, once you have the certificate and private key, configure gitlab to read them as if they were a traditional TLS certificate bought from a vendor.

Request certificate from letsencrypt via certbot

First, install certbot on your gitlab server host by following the let’s encrypt manual here.
Then, run the following command to request a certificate from letsencrypt, using the DNS challenge for verification.
certbot -d YOUR_GITLAB_SERVER_DOMAIN_NAME --manual --preferred-challenges dns certonly
Follow the instructions, and you will be given a DNS challenge string. Set up a TXT record with that string in your public DNS (certbot prints the record name to use, which starts with _acme-challenge).
Below is a picture I captured from Google to illustrate the TXT record setup.
Once you have completed the DNS challenge, you should get a certificate and private key, as stated in the command output. Personally, I suggest saving the important notes from that output into a file for future reference.

Configure gitlab server to read the certificate

By default, /etc/gitlab/gitlab.rb contains the following commented-out settings:
# nginx['ssl_certificate'] = "/etc/gitlab/ssl/#{node['fqdn']}.crt"
# nginx['ssl_certificate_key'] = "/etc/gitlab/ssl/#{node['fqdn']}.key"
You can override them with any values you like. I stick to the default configuration, so I copy the certificate and private key to the destinations above.
Note that if you use the default configuration like me, you should name the certificate YOUR_DOMAIN_NAME.crt and the private key YOUR_DOMAIN_NAME.key.
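The copy step can be sketched in Python as below. This is only a sketch under assumptions: /etc/letsencrypt/live/YOUR_DOMAIN_NAME with fullchain.pem and privkey.pem is certbot's usual default location, so check the paths printed in your own certbot output. The helper names and gitlab.example.com are hypothetical.

```python
import shutil
from pathlib import Path


def plan_copy(domain, live_root="/etc/letsencrypt/live", ssl_dir="/etc/gitlab/ssl"):
    """Return (source, destination) pairs for the certificate and private key.

    live_root is an assumption (certbot's usual default); adjust it to the
    paths shown in your certbot output.
    """
    live = Path(live_root) / domain
    target = Path(ssl_dir)
    return [
        (live / "fullchain.pem", target / f"{domain}.crt"),
        (live / "privkey.pem", target / f"{domain}.key"),
    ]


def install(domain):
    """Copy the files into place; usually needs root privileges."""
    for src, dst in plan_copy(domain):
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)


# Hypothetical usage on the gitlab host:
# install("gitlab.example.com")
```

Only plan_copy is side-effect free, so you can print its result first to confirm the file names before running install.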
Finally, restart your gitlab server, and check your new certificate by inspecting it in your browser.
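Since the verification has to be repeated every 3 months, it may help to check the expiry date from a script instead of the browser. Below is a minimal sketch using only the Python standard library. The function names and the host gitlab.example.com are hypothetical, and it must run from a machine inside the private network that already trusts the certificate chain.

```python
import socket
import ssl
import time


def days_left(not_after, now=None):
    """Whole days between `now` (epoch seconds) and a certificate's notAfter string."""
    if now is None:
        now = time.time()
    expires = ssl.cert_time_to_seconds(not_after)
    return int((expires - now) // 86400)


def fetch_not_after(host, port=443, timeout=5.0):
    """Open a TLS connection to host:port and return the certificate's notAfter field."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


# Hypothetical usage:
# print(days_left(fetch_not_after("gitlab.example.com")), "days until expiry")
```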

Probability speaking: How hard is it to get 20 wins in No tilt in Clash Royale

Recently, I read a forum post which calculates the probability of getting 20 wins in No tilt in Clash Royale. IMO, this is a clever and objective way to tell others how hard, and how valuable, it is to get 20 wins in No tilt!
For example, even if you are a top player with an 80% win rate, you only have a 15.45% chance of getting 20 wins. For a modest player with a 60% win rate, the chance is only 0.16%!
Since the author does not reveal the formula behind the calculation, I would like to work it out here out of curiosity.
For Chinese readers, you can take a look at the spreadsheet I wrote here

Facts

Before the calculations, we need to know the following facts:
  • At most 20 wins
  • At any time, you can have at most 3 losses (the tournament ends on the 3rd loss)
  • You always start from 0 wins

Calculation

I strongly recommend cross-checking with the spreadsheet while reading the steps below.

Step 0: Denote symbols

For brevity, we denote the following symbols
W: Win rate. Chance to win a match
L: Loss rate. Chance to lose 3 matches, which ends the tournament (i.e. 3 losses)
X: Number of wins in a tournament
N: Number of matches played (N = X + 3 when the tournament ends on the 3rd loss)

Step 1: Define the win rate

To determine the probability of getting 20 wins, we need your own win rate. Generally, you can read this from your player statistics.

Step 2: Calculate the loss rate

L = (1-W)(1-W)(1-W)
This value is essential for calculating the following 3 probabilities of interest:
  • Probability of only X wins
  • Probability of at most X wins
  • Probability of at least X wins

Step 3: Probability of only X wins

Generally, we just need to look at 3 scenarios before we can deduce a formula for P(only x wins)
  • P(0 win)
  • P(Only 1 win)
  • P(Only 2 wins)

Step 3.1: P(0 win)

The probability of 0 wins equals the loss rate, i.e. you lose 3 consecutive times from the start of the tournament.
P(0wins) = L

Step 3.2: P(Only 1 win)

You get 1 win and lose 3 times.
P(wlll) = LW
Recalling the combinatorics you learned in secondary school, there are the following orderings for 1 win and 3 losses.
  • wlll
  • lwll
  • llwl
  • lllw
The last one (lllw), however, is invalid given the facts above: the tournament already ends on the third loss. So, we need to cross it out. Finally, the result is:
P(only 1 win) = P(wlll) + P(lwll) + P(llwl) = LW * 3

Step 3.3: P(Only 2 wins)

  • Firstly, we calculate the probability of wwlll.
P(wwlll) = LWW
  • Then, we just need to sum up the valid orderings of 2 wins and 3 losses to get the result we need.
P(only 2 wins) = LWW * Z, where Z is the number of valid orderings.
The question is how to determine Z. Luckily, we can rely on a Combinations Calculator (nCr) to do the mathematics for us.
The number of ways to place 3 losses in 5 matches = 5C3 = 10
  • Thirdly, we need to subtract the invalid orderings, in which the third loss comes before the last match (lllww, llwlw, lwllw, wlllw):
Number of valid orderings = 5C3 - 4 = 5C3 - 4C3
  • Finally, we have the following result
P(only 2 wins) = LWW * (5C3 - 4C3) = LWW * 6
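If you do not trust the counting, a brute-force check is cheap. The sketch below is my own illustration, with a hypothetical function name. It lists every way to place 3 losses among the matches and keeps only the orderings whose last match is a loss, since the tournament ends on the third loss.

```python
from itertools import combinations


def count_valid_orders(wins, losses=3):
    """Count orderings of `wins` wins and `losses` losses that end on the final loss."""
    matches = wins + losses
    valid = 0
    for loss_positions in combinations(range(matches), losses):
        # combinations() yields sorted tuples, so the last entry is the final loss.
        if loss_positions[-1] == matches - 1:
            valid += 1
    return valid


print(count_valid_orders(1))  # 3, matching wlll, lwll, llwl
print(count_valid_orders(2))  # 6, matching 5C3 - 4C3
```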

Step 3.4: P(Only X wins)

There are clearly 2 factors in the formula:
  • The first part is the probability of one ordering of x wins in N matches
  • The second part is the number of valid orderings of x wins in N matches
The first part is L(W)^x.
The second part is (N)C3 - (N-1)C3, where N = x + 3, since an ordering is invalid exactly when all 3 losses fall within the first (N - 1) matches.
Finally, we have
P(Only x win) = L(W)^x * ((x+3)C3 - (x+3-1)C3)
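The formula translates almost directly into Python. This is a minimal sketch with a hypothetical function name, using math.comb (Python 3.8+). It is meant for x below the 20-win cap, because a run that reaches 20 wins stops there instead of ending on a third loss.

```python
from math import comb


def p_only(x, w):
    """P(only x wins) for win rate w, ending the tournament on the 3rd loss."""
    loss_rate = (1 - w) ** 3                   # L
    orders = comb(x + 3, 3) - comb(x + 2, 3)   # valid orderings; comb(2, 3) is 0
    return loss_rate * w ** x * orders


w = 0.6
print(p_only(0, w))  # equals L
print(p_only(1, w))  # equals 3 * L * W
print(p_only(2, w))  # equals 6 * L * W * W
```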

Step 4: P(at most X wins)

Since we know P(only x wins), we can calculate the probability of at most X wins as follows:
P(at most 0 win) = P(0 win)
P(at most x win) = P(only x win) + P(at most (x-1) win), for x ranging from 1 to 19
P(at most 20 win) = 1, since the tournament stops at 20 wins

Step 5: P(at least X wins)

Since we know P(only x win), we can calculate P(at least X wins) as follows:
P(at least 0 win) = 1
P(at least x win) = P(at least (x-1) wins) - P(only (x-1) win)
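To cross-check against the numbers quoted from the forum post, the recurrence can be run in a few lines of Python. This is a sketch with hypothetical function names, not the forum author's method, which was never published.

```python
from math import comb


def p_only(x, w):
    """P(only x wins): one ordering's probability times the number of valid orderings."""
    return (1 - w) ** 3 * w ** x * (comb(x + 3, 3) - comb(x + 2, 3))


def p_at_least(x, w):
    """P(at least x wins) = 1 minus every outcome that stops before x wins."""
    p = 1.0
    for k in range(x):
        p -= p_only(k, w)
    return p


print(f"{p_at_least(20, 0.8):.2%}")  # 15.45%
print(f"{p_at_least(20, 0.6):.2%}")  # 0.16%
```

Both values agree with the figures quoted at the top of this post.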
Congratulations. That’s the end of the calculation.

At last

If you have read this far, thanks for reading. I hope you enjoyed working out the probabilities by ourselves, especially since, judging by the comments in the original forum post, no one wanted to explain them.

2020/08/01

Use python's pathlib to handle file paths across OS

Why file paths are an issue across platforms

Nowadays, there are 3 common OSes: Windows, Linux, and MacOS. Since Linux and MacOS share the same file path implementation, developers do not need to worry about file path issues between them. The problem is between Windows and the others.
In Windows, a file path can be expressed in either of 2 formats:
  • \folder1\folder2\file1.txt (Windows format)
  • /folder1/folder2/file1.txt (POSIX format)
The difference is the direction of the slash. As a side note, file paths in Linux / MacOS are expressed in POSIX format ONLY.
Actually, if applications on Windows could handle both properly, I would not need to write this blog, since I could simply always use the 2nd format. In my experience, however, some applications on Windows accept only one of them. Picking the correct format for each application is a trial and error process.
Back to the title of this blog: the file path issue gets bigger if your implementation runs across platforms. For example, a library (the Python zipfile library) may accept POSIX format while the input file path is in Windows format. So, the question is whether python provides a library to do this conversion for us.
A side note on python zipfile: this library accepts both formats since py3.8

pathlib

Pathlib is a builtin library that fixes this problem. Below are 2 conversions done with it. For other advanced usage, please consult the library documentation here.

Get POSIX file path from different input format

import pathlib
winPath = r'\workspace\xxx\test_fixture\user-restore-success.zip'
posixPath = '/workspace/xxx/test_fixture/user-restore-success.zip'
pWIN = pathlib.PureWindowsPath(winPath)
pPOSIX = pathlib.PureWindowsPath(posixPath)
pWIN.as_posix()
#'/workspace/xxx/test_fixture/user-restore-success.zip'
pPOSIX.as_posix()
#'/workspace/xxx/test_fixture/user-restore-success.zip'

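Get Windows file path

str(pWIN)
#'\\workspace\\xxx\\test_fixture\\user-restore-success.zip'
str(pPOSIX)
#'\\workspace\\xxx\\test_fixture\\user-restore-success.zip'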

Notes

  • Always favor PureWindowsPath when doing conversion
PureWindowsPath is able to convert between Windows paths and POSIX paths. PurePosixPath, on the other hand, cannot, since a backslash (\) is a valid filename character in a POSIX path.
For details, please refer to the discussions here
  • Bug in Path.resolve() on Windows platform
According to here, Path.resolve() on Windows cannot return the absolute file path if the file does not exist yet. If you need a reliable way to get the absolute path of a file right now, use os.path.abspath instead
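The first note is easy to see for yourself. The sketch below feeds the same Windows-style string to both classes. Both are "pure" paths, so it runs on any OS without touching the file system.

```python
from pathlib import PurePosixPath, PureWindowsPath

win_path = r'\workspace\xxx\test_fixture\user-restore-success.zip'

# PureWindowsPath treats both \ and / as separators, so the path is split properly.
as_windows = PureWindowsPath(win_path)
print(as_windows.as_posix())
# '/workspace/xxx/test_fixture/user-restore-success.zip'

# PurePosixPath treats \ as an ordinary character: the whole string is one file name.
as_posix = PurePosixPath(win_path)
print(len(as_posix.parts))
# 1
print(as_posix.as_posix() == win_path)
# True, nothing was converted
```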

2020/04/26

A gotcha on python's round method (Banker's rounding)

Before studying the gotcha, let’s have a quiz and see whether you fall into the round() trap.
Try to round the 22 float numbers below to the nearest integer, and see whether you can get them all correct. Below is the quiz in python.
floatNumbers = [1.1, 1.2, 1.3, 1.4, 1.49, 1.5, 1.51, 1.6, 1.7, 1.8, 1.9]
roundNumbers = list(map(round, floatNumbers))
formatFloatNumbers = ['%.02f' % num for num in floatNumbers]
formatRoundNumbers = ['%.02f' % num for num in roundNumbers]
print('Q', formatFloatNumbers)
print('A', formatRoundNumbers)

floatNumbers = [2.1, 2.2, 2.3, 2.4, 2.49, 2.5, 2.51, 2.6, 2.7, 2.8, 2.9]
roundNumbers = list(map(round, floatNumbers))
formatFloatNumbers = ['%.02f' % num for num in floatNumbers]
formatRoundNumbers = ['%.02f' % num for num in roundNumbers]
print('Q', formatFloatNumbers)
print('A', formatRoundNumbers)
Below is the answer.
Q ['1.10', '1.20', '1.30', '1.40', '1.49', '1.50', '1.51', '1.60', '1.70', '1.80', '1.90']
A ['1.00', '1.00', '1.00', '1.00', '1.00', '2.00', '2.00', '2.00', '2.00', '2.00', '2.00']
Q ['2.10', '2.20', '2.30', '2.40', '2.49', '2.50', '2.51', '2.60', '2.70', '2.80', '2.90']
A ['2.00', '2.00', '2.00', '2.00', '2.00', '2.00', '3.00', '3.00', '3.00', '3.00', '3.00']
The gotcha is the 2.50 case. Rounding 2.50 gives 2.00 instead of 3.00. And this is the default, and correct, rounding behavior.
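If you really need the "school" rounding where 2.50 becomes 3, one option is the decimal module from the standard library, which lets you pick the rounding mode explicitly. This is a minimal sketch with hypothetical helper names, shown for the positive numbers used in the quiz.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def round_half_up(value):
    """Round to the nearest integer, with ties going up (2.5 -> 3)."""
    # Going through str() avoids carrying binary float noise into the Decimal.
    return int(Decimal(str(value)).quantize(Decimal('1'), rounding=ROUND_HALF_UP))


def round_half_even(value):
    """Round to the nearest integer, with ties going to the even neighbour (2.5 -> 2)."""
    return int(Decimal(str(value)).quantize(Decimal('1'), rounding=ROUND_HALF_EVEN))


print(round_half_up(1.5), round_half_up(2.5))      # 2 3
print(round_half_even(1.5), round_half_even(2.5))  # 2 2
print(round(1.5), round(2.5))                      # 2 2, same as half even
```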

Why?

This rounding behavior is called Round half to even
In short, it is the default tie-breaking rule of IEEE 754 floating-point arithmetic, and python 3's round() follows it. Since I am not going to go through the floating-point details here, I suggest you read this to get the full picture.
In fact, I would like to focus on another name for this rounding: Banker's Rounding

Banker’s Rounding

Besides being surprised by the behavior, I was also surprised by the other name of this rounding technique: Banker's Rounding.
As the name states, it is a rounding used by banks. So, why would banks adopt this strange rounding at all? Interestingly, it is because of Fairness.
Earning money is the job of banks (and of all companies). But banks need to do it in a legal and fair way. Rounding is essential in a bank’s dealings, for example time deposits and credit card loans. When rounding a number up or down to the nearest integer, there are 9 cases for the fractional part (from 0.1 to 0.9).
From a statistical point of view, numbers are evenly distributed among these 9 cases. Banks round down numbers in 0.1 ~ 0.4 and round up numbers in 0.6 ~ 0.9. For these 8 cases, the chances of banks paying more or less are even (fairness). The problematic case is 0.5. If banks always round 0.5 up, 5 of the 9 cases round up and the bank pays more, so investors will be angry. Conversely, if banks always round 0.5 down, clients will be mad.
As a result, banks adopt round half to even to fix this fairness issue. With this rounding, the single 0.5 case is split into 2 cases (even or odd integer part), so banks have equal chances of rounding up and down. That’s why it is also named Banker's rounding
IMO, this is a really clever trick to keep every party happy.
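The fairness argument can be checked with a tiny experiment. The sketch below is my own illustration, not a bank's actual procedure. It rounds ten amounts that all end in .5, once with ties always going up and once with python's round(), and compares each total with the true total.

```python
import math

halves = [k + 0.5 for k in range(10)]  # 0.5, 1.5, ..., 9.5

true_total = sum(halves)
half_up_total = sum(math.floor(v + 0.5) for v in halves)  # ties always go up
half_even_total = sum(round(v) for v in halves)           # ties go to the even neighbour

print(true_total)       # 50.0
print(half_up_total)    # 55, the payer loses 5 over ten ties
print(half_even_total)  # 50, the ups and downs cancel out
```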

2020/04/25

Color matters! Enable semantic Highlighting in your IDE NOW

Color matters

As software developers, we read and write code every working day. Color highlighting of code is one of the critical features provided by an IDE.
Recently, IDEs (vscode, pycharm, etc) have started to provide a new technique, semantic highlighting, which is an advanced version of syntax highlighting. I strongly recommend readers enable it NOW. (I learned this technique from this blog post)

Syntax highlighting

Firstly, let me try to convince you why color matters when dealing with code, by showing the same code with syntax highlighting on / off.
Below is a snippet captured from Django RegexValidator.
With syntax highlighting | No color
As the feature name syntax highlighting suggests, we can easily tell variables, keywords, and class names apart in a sea of code.
IMO, I would be dead if there were no syntax highlighting when reading code.

Semantic highlighting

While syntax highlighting colorizes code based on its syntax, semantic highlighting colorizes code based on its semantics. Take a look at the comparison below if you are not sure about the difference.
With syntax highlighting | With semantic highlighting
In syntax highlighting, all variables share the same color, while in semantic highlighting each has a unique color.
Variables are the same kind of syntax element, so they share one color in syntax highlighting. From a semantic point of view, however, they are different, so each gets its own color.
IMO, semantic highlighting is far more useful than syntax highlighting. By reading the colors, you can tell the scope of a variable.
This is VERY critical for writing / debugging code. Given a sea of variables inside a function, if they are all the same color, you need to understand each of them to find out which one is wrong. With semantic highlighting, your work may be easier, simply by reading the colors inside that function.
Honestly, the visual impact of switching between the two kinds of highlighting is smaller than that of turning color on / off. But I still strongly recommend you enable semantic highlighting NOW.
With the help of semantic highlighting, I am more confident of writing correct code the first time.

Let’s enable semantic highlighting. But how?

Pycharm

Since I am a fan of pycharm, below is the manual for enabling semantic highlighting

Other IDEs

Modern IDEs nowadays have a search function in their settings. You can find the switch by simply typing semantic highlighting into it.

Customize the color scheme

To save us time configuring code colors, there is a feature named color scheme, which pre-configures the colors for you. Instead of setting up from scratch, I recommend readers start by picking a color theme.
There are lots of builtin color themes to select from in modern IDEs. If none of them fulfills your needs, you can download color themes made by other developers, as plugins.
Personally, I am using a color theme from GAP
Of course, you can still fine tune the colors of each color scheme in your IDE settings.

Tips on testing colors in PYCHARM

  • Switching theme quickly
With the hotkey ctrl+`, you can bring up a menu for switching the color theme.

2020/01/18

Kindle books from simplified Chinese to traditional Chinese

Why having this blog

You may not know this, but online books in traditional Chinese are much fewer than those in simplified Chinese. Luckily, we can convert simplified Chinese to traditional Chinese, or vice versa, since they are typically just different representations of the same text. This blog post records the study I have made on this conversion.
To be honest, this is just a blog post collecting links and blogs in one place.

Method one: Convert the book directly

For readers able to read Chinese, please go to this link

Install Calibre

Download Plugin

Install the plugin into Calibre

Guides on using that plugin

  • Follow instructions in operation section
  • Follow instructions from the attachments

Upload your translated book

Once your book is translated, you can send it via Calibre directly. For me, I plug my kindle into my desktop via USB. Then, I click send file to the device in the Calibre menu. Done.

Method two: Upgrade kindle, and translate them natively

For readers able to read Chinese, please go to this link

Select Firmware from amazon

  • To support translation natively, firmware version must be at least 5.9.6
  • Usually, you can upgrade your firmware via kindle’s menu

Video steps for upgrading firmware manually

Install fonts

  • Unzip the zip file
  • Connect your kindle to the desktop via USB
  • Put one of the ttf files under the font folder in your kindle

Change language

  • Open your ebook in kindle
  • Change the language in the display menu
  • enjoy

Notes

None of my ebooks can use this translation. According to the guide from amazon, you cannot use a custom font if your ebook does not support it. I have, however, been unable to find a way to confirm this requirement for my ebooks.
Method 2 is just a record of my study.

Useful links

Epub 2 mobi

azw 2 epub