National Anthem of USSR

A piano recording from my dorm XD

I hope you enjoy it.

Nelf Name Generator V2

Introduction

In a 2016 post I made a name generator based on an LSTM language model (LSTM-LM). The quality is acceptable for novel writing, provided you are willing to spend some time searching the generated list for a satisfying name.

But consider the situation where you are designing a game and need to give a unique name to every random NPC: the quality of Nelf Name Generator V1 cannot meet that requirement without human supervision.

I have been looking for new techniques to refine the generator. Nelf Name Generator V2 is based on WGAN-GP, following the paper Improved Training of Wasserstein GANs.

Wasserstein GAN

GAN stands for Generative Adversarial Network. It consists of a discriminator (D) and a generator (G). To train the network, first use G to generate samples y', then mix y' with real samples y. D learns to classify whether each sample belongs to y' (generated) or y (real) and is trained on its classification error. Differentiating D with respect to y' gives the gradient of y', which indicates the direction that makes y' "more realistic". Back-propagating ∂D/∂y' into G moves the generated samples along that "realistic" direction. After several epochs the network reaches an equilibrium: D can no longer distinguish generated samples from real ones, ∂D/∂y' converges to zero, and G stops changing. At that point the generated samples follow the same probability distribution as the real samples.
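
As a minimal illustration of this loop (a PyTorch sketch, not the code used in this post; the layer sizes, noise dimension, and data are placeholders):

import torch
import torch.nn as nn

# Toy generator and discriminator; real models would be much larger.
G = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))
D = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)
bce = nn.BCELoss()

def train_step(real):                    # real: (batch, 8) tensor of real samples y
    batch = real.size(0)
    fake = G(torch.randn(batch, 16))     # generated samples y'

    # 1) Train D to tell y (label 1) from y' (label 0).
    d_loss = bce(D(real), torch.ones(batch, 1)) + \
             bce(D(fake.detach()), torch.zeros(batch, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # 2) Train G: back-propagating through D moves y' in the "more realistic" direction.
    g_loss = bce(D(fake), torch.ones(batch, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()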

Wasserstein GAN was proposed in the groundbreaking paper Wasserstein GAN by Martin Arjovsky in early 2017. There is a detailed interpretation on Zhihu. The paper discusses why GANs are hard to train. The mechanism behind GANs is to minimize the J-S or K-L divergence between the generated and real distributions. However, both distributions are low-dimensional manifolds in a high-dimensional space, so their overlap typically has measure zero, and the J-S and K-L divergences stop providing useful gradients in that situation: for the classic example of two parallel lines separated by a distance θ, the J-S divergence stays at the constant log 2 for every θ ≠ 0. The proposed fix is to use the Wasserstein distance instead, which equals |θ| in that example and therefore still gives a direction even when the distributions do not overlap.

The follow-up WGAN-GP paper, co-authored with Martin Arjovsky and published two months later, replaced one of the original improvements (weight clipping) with a more reasonable one (gradient penalty).
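
In code, the gradient-penalty version of the critic loss looks roughly like this (a PyTorch sketch under my own naming; the paper suggests λ = 10):

import torch

LAMBDA = 10  # gradient-penalty weight suggested by the WGAN-GP paper

def critic_loss(critic, real, fake):
    # Wasserstein part: the critic tries to score real samples high and fake samples low.
    w_loss = critic(fake).mean() - critic(real).mean()

    # Gradient penalty: sample points between real and fake, and push the critic's
    # gradient norm at those points toward 1 (instead of clipping weights).
    eps = torch.rand(real.size(0), *([1] * (real.dim() - 1)), device=real.device)
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    grad = torch.autograd.grad(critic(interp).sum(), interp, create_graph=True)[0]
    gp = ((grad.flatten(1).norm(2, dim=1) - 1) ** 2).mean()

    return w_loss + LAMBDA * gp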

Thanks to these traits of WGAN, this was the first time that text generation was achieved with GANs. In other GANs, the discriminator can tell real samples apart simply by checking whether the values are exact one-hot encodings, so it cannot provide any useful direction to the generator. The WGAN critic keeps pulling the two probability distributions closer even after it can already discriminate by the one-hot encoding.

Implementation

I did not write the code from scratch; I just pulled the code from the paper's original authors and modified it a little. The code is here. Since the corpus is very simple, I reduced the model's parameter space to prevent over-fitting: the original model used 5 residual blocks with 512 feature dimensions, and I reduced that to 2 residual blocks and 32 dimensions.
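
For reference, a rough PyTorch equivalent of the shrunken generator (the class names, sequence length, and charset size are my own assumptions; the actual code is the authors' TensorFlow implementation):

import torch
import torch.nn as nn

DIM, SEQ_LEN, VOCAB = 32, 12, 30   # feature dims, name length, charset size (assumed)

class ResBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.block = nn.Sequential(
            nn.ReLU(), nn.Conv1d(dim, dim, 5, padding=2),
            nn.ReLU(), nn.Conv1d(dim, dim, 5, padding=2),
        )
    def forward(self, x):
        return x + 0.3 * self.block(x)   # scaled residual connection

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(128, DIM * SEQ_LEN)
        self.blocks = nn.Sequential(ResBlock(DIM), ResBlock(DIM))  # 2 blocks instead of 5
        self.out = nn.Conv1d(DIM, VOCAB, 1)
    def forward(self, noise):            # noise: (batch, 128)
        x = self.fc(noise).view(-1, DIM, SEQ_LEN)
        x = self.blocks(x)
        return self.out(x).softmax(dim=1)  # per-position distribution over characters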

The training data was extracted from a private-server database for patch 7.3.5 (TrinityCore), combined with a client database from patch 7.2.5. I queried the names of all female night elf NPCs and manually removed everything except the first names. There are 661 names in total.
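
The cleanup step is roughly equivalent to the following (a sketch; the file names are hypothetical, and the race/gender filtering itself was done in the database query):

# Keep only the first word of each queried NPC name and drop duplicates.
# "nelf_female_npcs.txt" is a hypothetical dump of the query results.
with open("nelf_female_npcs.txt", encoding="utf-8") as f:
    raw = [line.strip() for line in f if line.strip()]

first_names = sorted({name.split()[0].strip(",") for name in raw})
print(len(first_names))   # should be 661 for this corpus

with open("first_names.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(first_names))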

Results

Left: Results from LSTM-LM (Nelf Name Generator V1). Middle: Random names in the client, prepared by Blizzard. Right: Results from WGAN-GP (Nelf Name Generator V2).

LSTM-LM Blizzard WGAN-GP
Liir Aqulais Alysna
Kyula Selwynn Myshaina
Aarael Alayia Saeurdore
Censa’oh Alasia Lilly
Salciea Elybrook Ishawnn
Mleharite Alaria Eloria
Aarnail Rochelle Novo
Sltthandris Ivy Jalena
Lashera Elessaria Falandria
‘yiuamaliana Mavralais Jayanna
Gorallia Aria Tyranna
Cieia Edelinn Shyela
Derelien Syyia Asy’ia
Ly’ura Brinna Lyanis
Tira Elyria Sniela
Kllyoana Adila Silra
Nvla Caylbrooke Aeya
Juraia Dara Hesteral
Flara Saelda Ella’dria
Titianna Annalore Cinls
Yeainsiy Elyda Laurne
Aasephine Kynlea Dulvian
Ahmnnai Cybelle Aelysea
Reanl Arlana Ranelao
Eeyra Saellea Roow
Dalllyn Shaulea Ea’yssia
Myirill Laana Lunura
Lelytha Saebrooke Allanya
Kylda Kynreith Leana
Myiuaa Syreith Kinda
Mini Lada Nauianaa
Ahynysil Catalin Syli’nna
Jolania Mavraena Chellsane
Alilune Belinna Arlysea
Tynytha Syda Illay
Clyraste Alareith Csana

A list containing 64,000 generated names (possibly with duplicates) is here. The left column is the generated name and the right column is the score given by the discriminator. It may help colleagues name new characters during creation.
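
If you only want the best candidates, sorting by the discriminator score is a quick filter (a sketch; I am assuming a whitespace-separated file with the name first and the score second):

# Load the generated list and keep the highest-scored occurrence of each name.
names = {}
with open("generated_names.txt", encoding="utf-8") as f:
    for line in f:
        parts = line.split()
        if len(parts) != 2:
            continue
        name, score = parts[0], float(parts[1])
        names[name] = max(score, names.get(name, float("-inf")))

top = sorted(names.items(), key=lambda kv: kv[1], reverse=True)[:100]
for name, score in top:
    print(f"{name}\t{score:.3f}")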

Nelf Name Generator

I took a look at LSTM-RNNs for the ICT&Cambricon work, and found the LSTM language model quite attractive.

I built a name generator for female night elves. I exported all female night elf NPC names from a private server database (Arkcore NG, 4.3.2), deleted last names and titles manually, and kept only the first names. There are 498 first names in total (possibly with a few duplicates).

I ran a Python script to generate a random sequence of 200,000 names as training data, then built a mini LSTM network with 2 layers, each with 1000 units and a 50% dropout rate after it.
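
The data-generation script amounts to something like this (a sketch; the file names and the uniform sampling are assumptions):

import random

# Sample 200,000 names (with repetition) from the extracted first names, one per line.
# "first_names.txt" and "train_names.txt" are hypothetical file names.
with open("first_names.txt", encoding="utf-8") as f:
    names = [line.strip() for line in f if line.strip()]

sequence = [random.choice(names) for _ in range(200_000)]

with open("train_names.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(sequence) + "\n")

The Caffe prototxt below defines the 2-layer LSTM language model itself.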


name: "lstm_language_model"
layer {
  name: "data"
  type: "HDF5Data"
  top: "cont_sentence"
  top: "input_sentence"
  top: "target_sentence"
  include { phase: TRAIN }
  hdf5_data_param {
    source: "hdf5_list.txt"
    batch_size: 200
  }
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "cont_sentence"
  top: "input_sentence"
  top: "target_sentence"
  include {
    phase: TEST
  }
  hdf5_data_param {
    source: "hdf5_list.txt"
    batch_size: 1
  }
}
layer {
  name: "lstm1"
  type: "LSTM"
  bottom: "input_sentence"
  bottom: "cont_sentence"
  top: "lstm1"
  recurrent_param {
    num_output: 1000
    weight_filler {
      type: "uniform"
      min: -0.08
      max: 0.08
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "lstm1-drop"
  type: "Dropout"
  bottom: "lstm1"
  top: "lstm1"
  dropout_param { dropout_ratio: 0.5 }
  include { stage: "lstm-drop" }
}
layer {
  name: "lstm2"
  type: "LSTM"
  bottom: "lstm1"
  bottom: "cont_sentence"
  top: "lstm2"
  recurrent_param {
    num_output: 1000
    weight_filler {
      type: "uniform"
      min: -0.08
      max: 0.08
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "lstm2-drop"
  type: "Dropout"
  bottom: "lstm2"
  top: "lstm2"
  dropout_param { dropout_ratio: 0.5 }
  include { stage: "lstm-drop" }
}
layer {
  name: "predict"
  type: "InnerProduct"
  bottom: "lstm2"
  top: "predict"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 29
    weight_filler {
      type: "uniform"
      min: -0.08
      max: 0.08
    }
    bias_filler {
      type: "constant"
      value: 0
    }
    axis: 2
  }
}
layer {
  name: "cross_entropy_loss"
  type: "SoftmaxWithLoss"
  bottom: "predict"
  bottom: "target_sentence"
  top: "cross_entropy_loss"
  loss_weight: 20
  loss_param {
    ignore_label: -1
  }
  softmax_param {
    axis: 2
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "predict"
  bottom: "target_sentence"
  top: "accuracy"
  include { phase: TEST }
  accuracy_param {
    axis: 2
    ignore_label: -1
  }
}

During training, feed names as the input and use the same names shifted one character to the left as the target output, i.e. input S-H-A-N-D-R-I-S, target H-A-N-D-R-I-S-'\n'. The network learns to predict which character is most likely to come next given the preceding context. If we feed the network's predicted output back into its input, it will loop over its own predictions and generate a list of names.
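
A sketch of that shift-by-one step and of the sampling loop (the character set and the predict_next callback are assumptions for illustration; the real pipeline goes through Caffe's HDF5 data layers and the trained net):

import random

CHARS = "\nabcdefghijklmnopqrstuvwxyz'-"   # 29 symbols, matching num_output: 29 above
IDX = {c: i for i, c in enumerate(CHARS)}

def make_pair(name):
    """Input is the name, target is the same name shifted one character left."""
    s = name.lower() + "\n"
    inputs  = [IDX[c] for c in s[:-1]]      # S H A N D R I S
    targets = [IDX[c] for c in s[1:]]       # H A N D R I S \n
    return inputs, targets

def generate(predict_next, max_len=16):
    """Feed the prediction back as the next input until '\n' is produced.
    predict_next(prefix) -> list of probabilities over CHARS (e.g. from the trained net)."""
    name = ""
    for _ in range(max_len):
        probs = predict_next(name)
        c = random.choices(CHARS, weights=probs)[0]
        if c == "\n":
            break
        name += c
    return name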

The result is really amazing. The sheet below shows two columns of names, one generated by the LSTM network and the other given by Blizzard staff. Can you tell which column was generated by the machine?

Clathiel Clarindrela
Doriana Dahlia
Elanndia Elessaria
Kulanai Kynreith
Lanandris Laria
Shyn'tel Shauana
Tylaria Tarindrella

C to HDL

I found a simple and generic way to translate programs into RTL. Several years ago I thought this was one of the hardest problems.

I came to this solution independently in April 2016, knowing only that Xilinx/Altera had solved it recently. Unfortunately, many people discovered and implemented it decades ago.

If you are looking for ideas about it, this example may help: c2hdl.

WoW Dictionary

Did some studying on NLP.

We are planning to build a patch-note bot that publishes patch changes to the forum automatically. I am working on the algorithm that compares tooltip differences.

Comparing differences between strings is simple with dynamic programming. If you don't know how to do this, just read this slide. That algorithm works at the granularity of characters. To scale the granularity up to words, we need to split the text into words before running the algorithm.
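
For example, once the tooltip is split into words, Python's difflib (which does a similar LCS-style matching) gives a word-level diff (a sketch, not the bot's actual code):

import difflib

def word_diff(old: str, new: str):
    """Diff two tooltips at word granularity instead of character granularity."""
    old_words, new_words = old.split(), new.split()
    out = []
    for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(
            None, old_words, new_words).get_opcodes():
        if tag in ("delete", "replace"):
            out.append("- " + " ".join(old_words[i1:i2]))
        if tag in ("insert", "replace"):
            out.append("+ " + " ".join(new_words[j1:j2]))
        if tag == "equal":
            out.append("  " + " ".join(old_words[i1:i2]))
    return out

print(word_diff("Deals 100 Fire damage", "Deals 120 Fire damage"))
# ['  Deals', '- 100', '+ 120', '  Fire damage']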

Splitting Latin-script text into words is easy; it won't take more than 5 lines of code.
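
A sketch of such a splitter (the token rules are my own):

import re

def split_words(text: str):
    # Words, numbers, and single punctuation marks each become a token.
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?|\d+(?:\.\d+)?|\S", text)

print(split_words("Deals 1,250 Fire damage to Illidan's target."))
# ['Deals', '1', ',', '250', 'Fire', 'damage', 'to', "Illidan's", 'target', '.']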

However, splitting Chinese text into words is hard. Billions of Chinese Ph.D. students graduate each year by spamming all kinds of word segmentation methods. We need a dictionary before doing word segmentation: a dictionary for WoW.

I extracted all strings from the DBC files and filtered them down to the zhCN corpus, then implemented this algorithm.
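
I won't restate the linked algorithm here, but this family of unsupervised dictionary extraction generally enumerates candidate n-grams from the raw corpus and keeps the ones that are frequent and internally cohesive. A very rough sketch of that idea, not the actual implementation:

import math
from collections import Counter

def extract_candidates(lines, max_len=4, min_count=5, min_pmi=3.0):
    """Rough unsupervised word discovery: keep frequent, cohesive n-grams.
    Scoring here uses pointwise mutual information only; a real extractor
    would also use boundary entropy and other filters."""
    grams = Counter()
    for line in lines:
        for n in range(1, max_len + 1):
            for i in range(len(line) - n + 1):
                grams[line[i:i + n]] += 1

    total = sum(c for g, c in grams.items() if len(g) == 1)
    words = {}
    for gram, count in grams.items():
        if len(gram) < 2 or count < min_count:
            continue
        p = count / total
        # Cohesion: the worst PMI over every binary split of the candidate.
        pmi = min(
            math.log(p / ((grams[gram[:k]] / total) * (grams[gram[k:]] / total)))
            for k in range(1, len(gram))
        )
        if pmi >= min_pmi:
            words[gram] = count
    return words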

The dictionary I got is OK for me. There are still some garbage words in it, but they are rare. Some words are filtered out because they are rarely used in tooltips, e.g. "灰谷" (Ashenvale; I'm really unhappy about this :X). UPD 2016/4/13: Ashenvale is back now.

I will open-source my dictionary extractor when the bot is ready. If you are looking for the dictionary itself, you can get it from my GitHub.