Generation of linguistic APIs based on audio or video resources (analysis)