Extracting diagnoses and investigation results from unstructured text in electronic health records by semi-supervised machine learning