Machine Learning - Pitch Detection

Published 4 January 2023
Updated 18 March 2026

A pitch detection system based on machine learning technology
  • Input - continuous monophonic audio signal sampled at 44.1 kHz with 16-bit samples.
  • ANN Architecture - Use an ANN, specifically an RNN with LSTM or GRU cells, trained using Adam.
  • Layer size (or should it simply be the number of parameters?) - sized to capture the longest waveform, i.e. one period of the lowest note. The Nyquist condition can be ignored as it is already accounted for by the sample rate. Probably needs a filter layer.
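  • Use a ReLU (or plain linear) output unit so that the output is a nice linear value. A softmax output would instead produce a probability distribution over classes, which suits note classification rather than a continuous frequency estimate.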
  • If each cell (4 weights) is treated as a filter, we could initialise the forget weights to pick out specific frequencies rather than initialising them randomly.
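
One way to read the filter idea in the last bullet: with its gates held fixed, an LSTM cell-state update c[t] = f*c[t-1] + i*g[t] behaves like a one-pole low-pass filter, so the forget-gate bias effectively sets a cutoff frequency. The sketch below illustrates that reading only. The function names, the choice i = 1 - f, and the one-cell-per-octave spacing are hypothetical and not part of the original notes. Note that a single real pole can only low-pass; picking out a narrow band would need something more, such as a pair of cells.

```python
import math

SAMPLE_RATE = 44100.0  # Hz, from the input specification


def forget_gate_for_cutoff(cutoff_hz, sample_rate=SAMPLE_RATE):
    """Forget-gate value f for which c[t] = f*c[t-1] + (1-f)*x[t]
    is a one-pole low-pass with roughly this cutoff.
    The approximation is best when the cutoff is well below Nyquist."""
    return math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)


def bias_for_forget_gate(f):
    """Inverse sigmoid: the bias b such that sigmoid(b) == f."""
    return math.log(f / (1.0 - f))


def one_pole_gain(f, freq_hz, sample_rate=SAMPLE_RATE):
    """Steady-state amplitude gain of the recurrence at freq_hz."""
    w = 2.0 * math.pi * freq_hz / sample_rate
    # |H(e^{jw})| for H(z) = (1 - f) / (1 - f * z^-1)
    return (1.0 - f) / math.sqrt(1.0 - 2.0 * f * math.cos(w) + f * f)


# Hypothetical initialisation: one cell per octave, starting at A0 (27.5 Hz).
cutoffs = [27.5 * 2 ** k for k in range(9)]
biases = [bias_for_forget_gate(forget_gate_for_cutoff(fc)) for fc in cutoffs]
```

Lower cutoffs give forget-gate values closer to 1 (longer memory), so the biases fall as the cutoff rises. Training would be free to move these values afterwards; the initialisation only gives each cell a different starting time scale.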

$\frac{1}{4} \cdot \frac{44100}{27.5} \approx 400.9$

$\frac{1}{4} \cdot 88 = 22$
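
The two figures above appear to divide a quantity by the 4 weights per cell. The first quantity is the number of samples in one period of the lowest note, A0, at 44.1 kHz. The second is a count of 88, which matches the number of keys on a standard piano, although the notes do not say so. The sketch below reproduces that arithmetic; the interpretation and all names are assumptions.

```python
SAMPLE_RATE_HZ = 44100   # input sample rate
LOWEST_FREQ_HZ = 27.5    # A0
WEIGHTS_PER_CELL = 4     # as stated in the notes
NOTE_COUNT = 88          # assumed meaning of the 88 in the second figure


def samples_per_period(freq_hz, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of samples spanned by one full cycle at freq_hz."""
    return sample_rate_hz / freq_hz


longest_period = samples_per_period(LOWEST_FREQ_HZ)  # about 1603.6 samples
cells_for_longest_wave = longest_period / WEIGHTS_PER_CELL
cells_for_notes = NOTE_COUNT / WEIGHTS_PER_CELL

print(round(cells_for_longest_wave, 1))  # 400.9
print(cells_for_notes)                   # 22.0
```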

  • Training Data - computer-generated audio at various frequencies and phase shifts, possibly including harmonics and different waveforms, from A0 (27.5 Hz) to B8 (7902.13 Hz).
  • Output - continuous value proportional to the frequency of the input signal
  • Test data set - tuning fork, guitar strings, violin strings
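
The training data and output described above could be produced by a small synthesiser. The following is a minimal sketch, assuming sine-based tones with a random phase and a few decaying harmonics, quantised to 16-bit integers, with a target equal to the frequency divided by the top of the range. The function names, the 1/k harmonic weighting, the log-uniform frequency sampling and the normalisation are illustrative choices, not something the project specifies.

```python
import math
import random

SAMPLE_RATE = 44100            # Hz
F_MIN, F_MAX = 27.5, 7902.13   # A0 to B8
INT16_MAX = 32767


def synth_tone(freq_hz, n_samples, phase=0.0, n_harmonics=1):
    """Return n_samples of a 16-bit tone: fundamental plus harmonics at 1/k amplitude."""
    # Keep only harmonics below the Nyquist frequency to avoid aliasing.
    ks = [k for k in range(1, n_harmonics + 1) if k * freq_hz < SAMPLE_RATE / 2]
    norm = sum(1.0 / k for k in ks)  # bounds the peak at full scale
    samples = []
    for n in range(n_samples):
        t = n / SAMPLE_RATE
        # Scaling the phase by k shifts the whole waveform in time without reshaping it.
        x = sum(math.sin(2 * math.pi * k * freq_hz * t + k * phase) / k for k in ks)
        samples.append(int(round(INT16_MAX * x / norm)))
    return samples


def random_example(n_samples=2048, rng=random):
    """One (audio, target) pair; frequency is drawn log-uniformly so each octave is equally likely."""
    freq = math.exp(rng.uniform(math.log(F_MIN), math.log(F_MAX)))
    phase = rng.uniform(0.0, 2 * math.pi)
    audio = synth_tone(freq, n_samples, phase, n_harmonics=rng.randint(1, 4))
    target = freq / F_MAX  # continuous value proportional to the frequency
    return audio, target
```

Passing a seeded `random.Random` instance as `rng` makes the generated set reproducible. Other waveforms (square, sawtooth) could be added by changing the harmonic weights, and recordings of real instruments would still be needed for the test set.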
