Nelf Name Generator

I have been looking into LSTM-RNNs as part of the ICT&Cambricon work, and I found the LSTM language model quite appealing.

I have built a name generator for female night elves. I exported all female night elf NPC names from a private server database (Arkcore NG, 4.3.2), manually deleted the last names and titles, and kept only the first names: 498 in total (possibly with a few duplicates).

I then run a Python script to generate a random sequence of 200,000 names as training data, and build a small LSTM network with two layers of 1000 units each and 50% dropout after each layer.
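The data-preparation step might look roughly like the sketch below. This is illustrative rather than the exact script: the input file name, the 29-symbol alphabet, the one-hot character encoding, and the single-stream HDF5 layout (so that batch_size simply slices 200 timesteps at a time) are all my assumptions, chosen to match the prototxt that follows.

import random
import numpy as np
import h5py

ALPHABET = list("abcdefghijklmnopqrstuvwxyz'") + ["\n", "<unk>"]   # 29 symbols (assumed)
CHAR_TO_ID = {c: i for i, c in enumerate(ALPHABET)}

# one first name per line (file name is illustrative)
with open("nelf_female_first_names.txt") as f:
    names = [line.strip().lower() for line in f if line.strip()]

in_ids, tgt_ids, conts = [], [], []
for _ in range(200000):
    name = random.choice(names)
    ids = [CHAR_TO_ID.get(c, CHAR_TO_ID["<unk>"]) for c in name]
    in_ids.extend(ids)                              # input:  S H A N D R I S
    tgt_ids.extend(ids[1:] + [CHAR_TO_ID["\n"]])    # target: H A N D R I S \n
    conts.extend([0] + [1] * (len(ids) - 1))        # 0 marks the first character of a name

T = len(in_ids)
cont = np.asarray(conts, dtype=np.float32).reshape(T, 1)
inp = np.zeros((T, 1, len(ALPHABET)), dtype=np.float32)
inp[np.arange(T), 0, in_ids] = 1.0                  # one-hot encode each character
tgt = np.asarray(tgt_ids, dtype=np.float32).reshape(T, 1)

with h5py.File("train.h5", "w") as f:
    f.create_dataset("cont_sentence", data=cont)
    f.create_dataset("input_sentence", data=inp)
    f.create_dataset("target_sentence", data=tgt)

with open("hdf5_list.txt", "w") as f:
    f.write("train.h5\n")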


name: "lstm_language_model"
layer {
  name: "data"
  type: "HDF5Data"
  top: "cont_sentence"
  top: "input_sentence"
  top: "target_sentence"
  include { phase: TRAIN }
  hdf5_data_param {
    source: "hdf5_list.txt"
    batch_size: 200
  }
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "cont_sentence"
  top: "input_sentence"
  top: "target_sentence"
  include {
    phase: TEST
  }
  hdf5_data_param {
    source: "hdf5_list.txt"
    batch_size: 1
  }
}
layer {
  name: "lstm1"
  type: "LSTM"
  bottom: "input_sentence"
  bottom: "cont_sentence"
  top: "lstm1"
  recurrent_param {
    num_output: 1000
    weight_filler {
      type: "uniform"
      min: -0.08
      max: 0.08
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
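# The dropout layers below are only instantiated when the net is created with
# the "lstm-drop" stage in its NetState.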
layer {
  name: "lstm1-drop"
  type: "Dropout"
  bottom: "lstm1"
  top: "lstm1"
  dropout_param { dropout_ratio: 0.5 }
  include { stage: "lstm-drop" }
}
layer {
  name: "lstm2"
  type: "LSTM"
  bottom: "lstm1"
  bottom: "cont_sentence"
  top: "lstm2"
  recurrent_param {
    num_output: 1000
    weight_filler {
      type: "uniform"
      min: -0.08
      max: 0.08
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "lstm2-drop"
  type: "Dropout"
  bottom: "lstm2"
  top: "lstm2"
  dropout_param { dropout_ratio: 0.5 }
  include { stage: "lstm-drop" }
}
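# Per-timestep classifier over the 29-character vocabulary; axis: 2 applies the
# inner product independently at every timestep of the T x N x C blob.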
layer {
  name: "predict"
  type: "InnerProduct"
  bottom: "lstm2"
  top: "predict"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 29
    weight_filler {
      type: "uniform"
      min: -0.08
      max: 0.08
    }
    bias_filler {
      type: "constant"
      value: 0
    }
    axis: 2
  }
}
layer {
  name: "cross_entropy_loss"
  type: "SoftmaxWithLoss"
  bottom: "predict"
  bottom: "target_sentence"
  top: "cross_entropy_loss"
  loss_weight: 20
  loss_param {
    ignore_label: -1
  }
  softmax_param {
    axis: 2
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "predict"
  bottom: "target_sentence"
  top: "accuracy"
  include { phase: TEST }
  accuracy_param {
    axis: 2
    ignore_label: -1
  }
}

During training, each name is fed as input and the same name shifted one character to the left is the target output, i.e. input S-H-A-N-D-R-I-S, target H-A-N-D-R-I-S-'\n'. The network learns to predict which character is most likely to come next given the context so far. If we feed the network's predicted output back into its input, it loops on its own predictions and generates a list of names.
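Generation could be sketched roughly as follows. Again this is my sketch, not the original script: the deploy prototxt (data layers replaced by two input blobs, cont_sentence of shape 1x1 and input_sentence of shape 1x1x29), the caffemodel file name, the random-first-letter seeding, and the sampling strategy are all assumptions.

import random
import numpy as np
import caffe

ALPHABET = list("abcdefghijklmnopqrstuvwxyz'") + ["\n", "<unk>"]   # same 29 symbols as above
CHAR_TO_ID = {c: i for i, c in enumerate(ALPHABET)}

net = caffe.Net("deploy.prototxt", "name_generator.caffemodel", caffe.TEST)

def sample_name(max_len=20):
    # Seed with a random first letter; '\n' is only ever a target during
    # training, never an input, so we cannot start from it.
    first = random.choice("abcdefghijklmnopqrstuvwxyz")
    name, prev_id, cont = first, CHAR_TO_ID[first], 0   # cont = 0 resets the LSTM state
    for _ in range(max_len):
        net.blobs["cont_sentence"].data[...] = cont
        net.blobs["input_sentence"].data[...] = 0
        net.blobs["input_sentence"].data[0, 0, prev_id] = 1
        net.forward()
        scores = net.blobs["predict"].data[0, 0].astype(np.float64)
        probs = np.exp(scores - scores.max())
        probs /= probs.sum()                             # softmax over the 29 characters
        next_id = int(np.random.choice(len(ALPHABET), p=probs))
        if ALPHABET[next_id] == "\n":                    # the net decided the name is complete
            break
        name += ALPHABET[next_id]
        prev_id, cont = next_id, 1
    return name.capitalize()

for _ in range(10):
    print(sample_name())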

The result is really impressive. The table below shows two columns of names: one column was generated by the LSTM network, the other was written by Blizzard staff. Can you tell which column was generated by the machine?

Clathiel      Clarindrela
Doriana       Dahlia
Elanndia      Elessaria
Kulanai       Kynreith
Lanandris     Laria
Shyn'tel      Shauana
Tylaria       Tarindrella