whisper.cpp

Author	SHA1	Message	Date
Georgi Gerganov	514cd04452	whisper : fix bug in prompt processing (close #705 ) Was dereferencing a dangling pointer	2023-04-14 19:17:07 +03:00
Georgi Gerganov	69b8503935	ggml : backport llama.cpp updates (close #709 ) - About x2 overall performance improvement on Apple Silicon - Results should now be the same for different number of threads (not tested)	2023-04-10 22:28:54 +03:00
pajowu	0a2d1210bc	whisper : add progress callback (#600 )	2023-03-30 20:29:29 +03:00
Jhen-Jie Hong	eefed45e37	whisper : add initial_prompt param (#645 )	2023-03-29 23:23:23 +03:00
Georgi Gerganov	42c6855103	whisper : bump "large" scratch buffer even mode (close #671 )	2023-03-28 10:50:49 +03:00
Georgi Gerganov	0be9cd3497	whisper : increase scratch buffers after recent change (#671 ) Should fix the error: ggml_new_tensor_impl: not enough space in the scratch memory	2023-03-28 10:36:16 +03:00
Georgi Gerganov	4a0deb8b1e	talk-llama : add new example + sync ggml from llama.cpp (#664 ) * talk-llama : talk with LLaMA AI * talk.llama : disable EOS token * talk-llama : add README instructions * ggml : fix build in debug	2023-03-27 21:00:32 +03:00
Georgi Gerganov	8e361d90d7	whisper : disable fallbacks until the performance is improved (#588 )	2023-03-22 22:34:39 +02:00
sandrohanea	d4fa0d92ad	fixed language auto-detection for state provided processing (#627 ) Co-authored-by: Sandro Hanea <sandrohanea@microsoft.com>	2023-03-22 21:47:09 +02:00
Leo Moll	8fcd1a3b32	main : provide option for creating JSON output (#615 ) * examples : provide option for exporting also as JSON file (ggerganov/whisper.cpp#614) * main : remove leftovers --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-03-22 21:37:36 +02:00
Georgi Gerganov	1beff6f66d	models : change HF hosting from dataset to model	2023-03-22 20:44:56 +02:00
Takeshi Inoue	09e9068007	whisper.android : support benchmark for Android example. (#542 ) * whisper.android: Support benchmark for Android example. * whisper.android: update screenshot in README. * update: Make text selectable for copy & paste. * Update whisper.h to restore API name Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * whisper.android: Restore original API names. --------- Co-authored-by: tinoue <tinoue@xevo.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-03-07 21:36:30 +02:00
sandrohanea	59fdcd19c8	whisper : add whisper_state + default state on the whisper_context (#523 ) * Added whisper state + default state on the whisper_context * Fixed some examples and bindings * Fixed whisper_n_len (which was used in some binding) and added whisper_n_len_from_state * Fixed comments * whisper : reuse kv_cache_free() and fix compiler warnings * whisper : clean-up the API comments --------- Co-authored-by: Sandro Hanea <sandrohanea@microsoft.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-03-05 21:42:19 +02:00
Georgi Gerganov	478289a4b3	whisper : set no_context == true by default (#537 )	2023-03-05 20:53:43 +02:00
Georgi Gerganov	373043cabe	whisper : zero-initialize some more context variables Just in case	2023-02-21 19:00:42 +02:00
Finn Voorhees	fb4d0d470f	whisper : fix uninitialized exp_n_audio_ctx	2023-02-21 18:58:08 +02:00
Georgi Gerganov	0d229163bb	whisper : add API for applying custom logits filters during decoding	2023-02-19 18:35:01 +02:00
Georgi Gerganov	a94897bcde	whisper : by default disable non-speech tokens suppression (#473 ) This seems to be causing hallucinations in the end of the audio, e.g.: "Thank you for listening" "Amen" ..	2023-02-15 21:48:49 +02:00
shikokuchuo	0336161b7d	whisper : fix signedness compiler warning (#506 )	2023-02-15 19:08:25 +02:00
shibukazu	cfc06bf8df	whisper : suppress non-speech-related token outputs (#473 ) * add non-speech-token suppression * add suppress non-speech_tokens param	2023-02-08 09:05:34 +02:00
sandrohanea	2bfe0ebc0f	whisper : fixed Beam Search Strategy and exposed whisper_pcm_to_mel_phase_vocoder (#474 ) Co-authored-by: Sandro Hanea <sandrohanea@microsoft.com>	2023-02-08 09:01:47 +02:00
boolemancer	4dd7119deb	whisper : only trim if split_on_word is true (#476 )	2023-02-08 08:43:23 +02:00
kamranjon	a1c1583cc7	whisper : add whisper_full_lang_id() for getting the context lang (#461 )	2023-02-05 14:46:26 +02:00
Matija Pevec	d012b5c7e4	whisper : add "split_on_word" flag when using "max_len" option (#455 ) * Update whisper.cpp * fix: trim function * feat: added flag to split on word * fix: arguments for main	2023-02-05 14:44:23 +02:00
Georgi Gerganov	f3ee4a9673	whisper : reduce memory usage during inference (#431 ) * ggml : add "scratch" buffer support * ggml : support for scratch ring-buffer * ggml : bug fix in ggml_repeat() * ggml : error on scratch buffer overflow * whisper : use scratch buffers during inference (base model only) * whisper : update memory usage for all models * whisper : fix encoder memory usage * whisper : use whisper_context functions instead of macros * whisper : fix FF + remove it from README * ggml : reuse ggml_new_i32 * ggml : refactor the scratch buffer storage * whisper : reorder scratch buffers in the decoder * main : add option to disable temp fallback * Update README.md	2023-02-04 09:45:52 +02:00
Georgi Gerganov	291980369c	whisper : suppress task tokens (#442 )	2023-02-04 09:03:14 +02:00
Georgi Gerganov	b992f3709e	whisper : do not provide past prompt when n_max_text_ctx == 0	2023-01-25 20:01:00 +02:00
Georgi Gerganov	b5ddb16ec7	whisper : condition timestamps to be monotonically increasing (#425 )	2023-01-23 20:48:26 +02:00
fitzsim	ae16c21e9c	whisper : PPC64 big-endian support (#398 ) * ggml : set cache line size to 128 on POWER9 * whisper : add PPC64 big endian support	2023-01-23 20:48:10 +02:00
Georgi Gerganov	78f166174f	whisper : fix condition for providing past prompt (critical) This bug has been present since v1.1.0. Effectively, the past transcribed text wasn't being used for following transcriptions, which likely significantly reduces the transcription quality. Likely related to #419	2023-01-22 10:47:01 +02:00
Georgi Gerganov	21c569ba4a	whisper : extend information in whisper_print_timings()	2023-01-19 18:50:33 +02:00
Georgi Gerganov	1a91c19af9	whisper : perform entropy check only when we have at least 32 tokens (#412 )	2023-01-18 22:52:18 +02:00
Georgi Gerganov	a6cf6f4c4a	bench : minor fixes	2023-01-18 21:40:10 +02:00
Georgi Gerganov	1ccb8a46a5	bench : fix Windows linkage by moving ggml benches in whisper lib ..	2023-01-18 21:19:50 +02:00
Georgi Gerganov	8088a977af	whisper : fix possible uninitialized variables (#291 )	2023-01-16 21:44:40 +02:00
Georgi Gerganov	00ea21668b	whisper : account speed_up flag for short audio (close #405 )	2023-01-15 12:42:15 +02:00
Georgi Gerganov	8de452c18b	Improve decoding (#291 ) * whisper : prepare infra for new decoding strategies * whisper : apply logit filters and compute logprobs * whisper : add whisper_get_logits() * whisper : separate self and cross attention memory Initial step needed for supporting parallel decoders * whisper : move probs_id buffer to whisper_context * whisper : refactor kv cache into separate struct * whisper : move self-attention kv cache to whisper_decoder * whisper : wip decoding parameters + strategies * whisper : wip decoding parameters + strategies (part 2) * whisper : wip decoding parameters + strategies (part 3) * whisper : wip decoding parameters + strategies (part 4) * whisper : fix prompt_past update to not include prompt_init * whisper : temperature + best_of support * whisper : support for compression_ration_threshold We actually use entropy, but it is similar * command : fix example to use logits instead of obsolete probs * whisper : handle empty sequence ranking * whisper : add WHISPER_DEBUG + diagnostic prints + new main args * whisper : minor fixes * whisper : add beam-search support * whisper : bug fix when there no previous context * whisper : add comments * stream : disable temperature fallback For real-time processing, we always want a single decoder running at T=0 * whisper.swiftui : update example - fix paths + add empty folders	2023-01-15 11:29:57 +02:00
Georgi Gerganov	4ef3398e8f	ggml : remove obsolete zeroing + comment fixes (#390 )	2023-01-08 20:21:03 +02:00
boolemancer	08dc705a69	whisper : fix sample_to_timestamp calculation with 64 bit precision to avoid overflow (#388 ) * Do calculation with 64 bit precision to avoid overflow * Update whisper.cpp Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-01-08 15:08:45 +02:00
Syahmi Azhar	1512545149	whisper : add loader class to allow loading from buffer and others (#353 ) * whisper : add loader to allow loading from other than file * whisper : rename whisper_init to whisper_init_from_file * whisper : add whisper_init_from_buffer * android : Delete local.properties * android : load models directly from assets * whisper : adding <stddef.h> needed for size_t + code style Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-01-08 13:03:33 +02:00
Georgi Gerganov	65fdcbbbbb	whisper : revert accidental MB change	2023-01-07 16:18:21 +02:00
Georgi Gerganov	d61d55cd4b	ggml : speed-up soft max via Accelerate + unroll	2023-01-07 16:16:42 +02:00
Abitofevrything	a62170c656	ggml : add SSE3 and fp16 conversion lookup table (#368 ) * Improves WASM performance: On MacBook M1 Pro, I observe 25% faster using Firefox and 35% faster using Chrome * Add support for SSE3 SIMD * Add SSE3 to system information * Add Imath support for fp16-fp32 conversions * Add Imath to system information * Wrap Imath calls to avoid static function warnings * Drop Imath; Add lookup table for f16 -> f32 conversions * Remove TODO comments * Update SSE3 to new macro arguments * Correct updated macro definitions * Prefer static inline where possible * ggml : static inlines + add public f16 <-> f32 conversions Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-01-06 18:45:59 +02:00
Thomas Fitzsimmons	1944e7c33e	whisper : document POWER VSX support	2023-01-05 23:53:00 +02:00
Georgi Gerganov	ad2a4ffa03	whisper : do not use F16 tensors when in F32 mode (#369 )	2023-01-05 22:56:25 +02:00
Andy Maloney	dd6d582977	whisper : use range-based for loops for readability	2023-01-05 21:20:44 +02:00
Georgi Gerganov	d51c5eb906	ggml : define MIN / MAX only if not defined (minor)	2023-01-05 21:16:52 +02:00
Georgi Gerganov	d97e6005e9	whisper : add whisper_n_audio_ctx and check for invalid audio_ctx closes #344	2022-12-31 09:57:19 +02:00
Georgi Gerganov	68daf6e487	whisper : avoid some memory allocations	2022-12-30 13:43:48 +02:00
Georgi Gerganov	ac521a566e	ggml : simplify the SIMD code (#324 ) * ggml : simplify the SIMD code * ggml : generic reduce for all register sizes + comments	2022-12-24 10:22:28 +02:00
Andy Maloney	543bd5627e	whisper : use emplace_back in place of push_back (#319 ) This avoids potential construction of temporaries.	2022-12-23 11:07:19 +02:00
Andy Maloney	62fee9a9cc	whisper : fix mem leak on failure to load model (#318 )	2022-12-23 11:06:17 +02:00
Andy Maloney	fa463313ad	minor : small code cleanups (#302 ) * Small code cleanups - fix indentation - remove extra semicolons - remove extra break after returns in case statements - remove unnecessary call to .data() on string - use empty() instead of checking size() - no need to check for nullptr before free - remove unnecessary initialization of string to "" * minor : switch case always break Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2022-12-22 17:06:19 +02:00
Georgi Gerganov	501a6b455c	minor : flag "ARM FMA" -> "ARM_FMA"	2022-12-22 16:47:54 +02:00
Kevin Brothaler	e1432dd91a	Check for both __ARM_NEON and __ARM_FEATURE_FMA so that the project can be compiled for armv7a. Android armeabi-v7a's NEON support doesn't support FMA unless configured with `-mfpu=neon-fp-armv8`, which would need runtime checks. * Also removed ABI filter from Android project.	2022-12-22 16:47:54 +02:00
Andy Maloney	42c6730732	whisper : use nullptr (C++11) instead of NULL macro (#299 )	2022-12-22 16:35:18 +02:00
Georgi Gerganov	99da1e5cc8	cmake : enable and fix -Wall -Wextra -Wpedantic C++ warnings	2022-12-19 20:45:08 +02:00
Matheus de Sousa	8e3f129b4d	minor : resolves some of warnings when compiling with clang/clang++ (#294 ) * Resolves some of warnings when compiling with clang/clang++ Mostly nit stuff that clang catches when compiling with -Wall -Wextra -pedantic. - Fix comparison between sign/unsigned integers. - Passes a constant reference (const&) instead of copying each time. * minor : normalize coding style * minor : fix warning Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2022-12-19 20:19:01 +02:00
Georgi Gerganov	fba10a4c68	whisper : language auto-detect (#59 )	2022-12-17 18:49:44 +02:00
Georgi Gerganov	6a69e3ae27	command : adding guided mode	2022-12-16 19:38:18 +02:00
Georgi Gerganov	bf69b669a0	whisper : add whisper_tokenize() Tokenizes a string into a list of vocabulary tokens	2022-12-16 19:38:18 +02:00
Georgi Gerganov	6a7c82501e	whisper : improve decoding strategy (#244 ) - Clear past prompt when there is very short audio left for processing. My observation is that in these cases the decoding tends to repeat and hallucinate stuff and I think this is induced by the existing prompt - When we fail to sample timestamp token, retry by clearing the past prompt. If it fails again, then we advance the window by 1 second	2022-12-16 18:34:35 +02:00
Georgi Gerganov	124c718c73	whisper : fix UB when reading buffer of length 0 bytes (#265 )	2022-12-13 23:14:47 +02:00
Roland Rabien	e70d47baab	Remove C++20 requirement (#257 ) * Remove C++20 requirement * Roll back C features not supported in VS2017	2022-12-11 20:03:07 +02:00
bert hubert	d1da35de06	fix potential bug reading model data into a small size optimized string which could lead to memory corruption. In an SSO string, you can't write data to &str[0] and expect it to work well. Also added a small wrapper function to more safely read model data without having to get the sizeof right. I tested this on tiny, base and large models, there was no change in behaviour.	2022-12-10 16:20:48 +02:00
Georgi Gerganov	603f97ba11	whisper : minor improvement in decoding strategy (#244 ) Do not allow for text segments to go beyond end of audio. This partially mitigates some issues when the last audio window is 1-2 seconds just before the end of the audio file and the decoding spirals into a repetition of the last transcribed phrase.	2022-12-10 13:38:26 +02:00
Georgi Gerganov	f8ec718b76	ggml : add F16C CPU flag check	2022-12-06 21:56:56 +02:00
Georgi Gerganov	78d13257be	Try to improve the token sampling strategy (#193 ) * whisper : try to improve the token sampling strategy - Add the "max_initial_timestaamp" token logic from OpenAI - Disallow sampling timestamps that are in the past * whisper : fix the max initial timestamp logic + fallback decoding	2022-12-02 21:51:50 +02:00
Georgi Gerganov	4698dcdb52	whisper : add mechanism for aborting the whisper_full() computation	2022-11-27 20:42:45 +02:00
Georgi Gerganov	e266cb0723	whisper.objc : add real-time processing (#97 ) Similar to the "stream" app	2022-11-26 18:32:46 +02:00
Georgi Gerganov	c207eed431	whisper.objc : fix build warnings	2022-11-26 16:27:04 +02:00
Georgi Gerganov	be16dfa038	whisper.wasm : do not block page while processing (close #86 )	2022-11-25 23:07:42 +02:00
Georgi Gerganov	b8ce25dec1	refactoring : more readable code	2022-11-25 19:28:04 +02:00
Georgi Gerganov	128aaadb93	whisper : improve printfs	2022-11-24 17:54:16 +02:00
katsu560	83456076f0	add AVX support	2022-11-23 22:16:33 +02:00
Georgi Gerganov	49706a658a	minor : updates few prints + fix buttons in whisper.wasm	2022-11-23 17:19:21 +02:00
Georgi Gerganov	385236d1d3	stream : "-kc" now enables context keeping from previous segment (#90 ) By default, the context keeping is disabled	2022-11-22 18:21:15 +02:00
M. Eren Akbiyik	63ae03b8e0	Prompt previous tokens for streaming (#163 ) * feat: prompt previous tokens for streaming I used a vector pointer instead of vector itself because it gave weird errors, and why not * convert vector to use with C api * feat: remove old refs, check for prompt size * feat: use better way of getting the pointer	2022-11-22 18:10:35 +02:00
Georgi Gerganov	a4dfbeecf9	talk.wasm : GPT-2 meets Whisper in WebAssembly (#155 ) * talk : initial real-time transcription in the browser * talk : polishing the UI * talk : ready for beta testing * talk.wasm : rename example	2022-11-21 22:20:42 +02:00
Georgi Gerganov	fb8d77f760	stream : add "audio_ctx" parameter Used to overwrite the audio context size of the Encoder. For example, setting "audio_ctx = 512" will make it run about 3 times faster, processing about 10s of audio, instead of 30s. The transcription quality drops, but this can be used for real-time streaming purposes where performance is important.	2022-11-20 21:22:41 +02:00
Georgi Gerganov	62b5ff875c	stream : add "max_tokens" parameter Used to limit the number of tokens in a segment. Useful to battle with word repetition when using partial encoder context	2022-11-20 21:22:41 +02:00
Georgi Gerganov	d351771a4b	stream : add "single_segment" option Force the entire audio chunk to be transcribed into a single segment	2022-11-20 21:22:41 +02:00
Georgi Gerganov	c058aaf22e	stream : partial encoder experiments	2022-11-20 21:22:41 +02:00
greeshmay	2ba66360c9	fix: free ggml_context (close #149 ) (#150 ) * fix: free ggml_context * ggml : free the model's contexts in whisper_free() Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2022-11-17 22:12:51 +02:00
Georgi Gerganov	83c742f1a7	whisper : add option to speed up the audio tempo by x2 Using a Phase Vocoder for speeding up the audio tempo by scaling down the frequencies in the frequency domain. This reduces the computation in the Encoder by a factor of 2. The transcription accuracy is degraded, but for slow to normal speech - it seems to be still very good. I think this can find application for real-time transcription - i.e. the "stream" example.	2022-11-13 16:25:43 +02:00
Georgi Gerganov	c30bffc8a5	ref #22 : add "duration" option Can be used to partially process a recording	2022-11-07 20:14:52 +02:00
Georgi Gerganov	d5afebd37c	whisper : token-level timestamp refactoring (#49 , #120 ) This turned out pretty good overall. The algorithm has been moved from main.cpp to whisper.cpp and can be reused for all subtitles types. This means that now you can specify the maximum length of the generated lines. Simply provide the "-ml" argument specifying the max length in number of characters	2022-11-02 21:45:54 +02:00
Georgi Gerganov	02dfd5b8c3	whisper : fix extra memory usage after recent processor changes Had increased the memory buffer to the size of the model and forgot to bring it down.	2022-11-02 18:31:18 +02:00
Georgi Gerganov	57fb46f307	main : add option for word-level timestamps (very experimental)	2022-10-30 17:06:57 +02:00
Georgi Gerganov	eba62e0fa1	close #113 : fix struct whisper_token_data	2022-10-30 08:23:52 +02:00
Georgi Gerganov	014a119052	minor : fix multiple definitions of to_timestamp()	2022-10-29 19:37:19 +03:00
Georgi Gerganov	dec40be58f	parallel : print time of audio boundaries + fix timings	2022-10-29 19:37:19 +03:00
Georgi Gerganov	0b2dc3c82c	parallel : working	2022-10-29 19:37:19 +03:00
Georgi Gerganov	85d6e1e1e7	main : fix sampling time + add max_context parameter	2022-10-29 19:37:19 +03:00
Georgi Gerganov	72e9cdd6bf	parallel : adding tool for parallel transformer inference	2022-10-29 19:37:19 +03:00
Borislav Stanimirov	c565c569e7	Define WHISPER_BUILD so as to export symbols on Windows	2022-10-29 13:23:09 +03:00
Georgi Gerganov	34bb3ab0cf	ggml : add system info functions	2022-10-25 20:53:48 +03:00
Georgi Gerganov	5f7e9fa2dc	ref #68 , #79 : fix segment time output	2022-10-23 13:30:30 +03:00
Georgi Gerganov	7affd309d3	whisper : add new-segment callback Can be used to process new segments as they are being generated. Sample usage in main, for printing the resulting segments during the inference.	2022-10-22 21:17:21 +03:00
Georgi Gerganov	31ff0c6a1f	wip : experimental color coding of tokens based on probabilities	2022-10-22 21:17:21 +03:00
Georgi Gerganov	8d15a1c635	ci : fix and re-enable tests (2nd try)	2022-10-21 15:57:20 +03:00
Georgi Gerganov	692aa0784f	Revert "ci : fix and re-enable tests" This reverts commit `80aefc9514`.	2022-10-21 15:36:19 +03:00
Georgi Gerganov	80aefc9514	ci : fix and re-enable tests	2022-10-21 15:27:30 +03:00
Georgi Gerganov	7eeef0358a	ref #52 : improve greedy sampling strategy Force timestamp token to be sampled if the probability sum over all timestamp tokens is above the probability of any other token	2022-10-18 19:48:15 +03:00
Georgi Gerganov	e30cf83158	ref #57 , #62 , #63 : remove unions in C-api + remove designated initializers We are not ready for designated initializers - many compilers do not support this C++ feature yet, so removing its non-trivial usages.	2022-10-18 18:17:24 +03:00
Georgi Gerganov	d6b84b2a23	ref #62 : fix build for some compilers For some reason, new version of GCC panic when the struct type is not specified explicitly	2022-10-18 10:57:03 +03:00
Georgi Gerganov	b4a3875b2c	Revert recent sampling change It does not actually help and seems to produce worse results on some of the samples	2022-10-18 08:26:16 +03:00
Georgi Gerganov	cf67bfffa0	Fix EOT token handling If it is the end of the audio, pick all sampled tokens. Otherwise, print error message.	2022-10-18 00:53:06 +03:00
Georgi Gerganov	d14823582d	Try to improve the sampling strategy a bit It still fails sometimes when it does not sample a timestamp token for the entire segment. We now print a message in such cases	2022-10-18 00:12:51 +03:00
Georgi Gerganov	20d8e7a309	Fix memory sizes	2022-10-18 00:12:51 +03:00
Georgi Gerganov	72d967bce4	Use Accelerate framework on Apple silicon Huge performance improvement in the Encode (almost x2 on MacBook M1 Pro) Also various extra optimizations: - Multi-threaded NORM operator - Faster GELU via F16 cast	2022-10-18 00:12:51 +03:00
Georgi Gerganov	0ad085f5e8	ref #48 : clear results at the start of whisper_full This way, even if the input audio is empty, the previous results will be removed.	2022-10-15 09:55:28 +03:00
0/0	b799226973	check if spectrogram length is <100 before doing anything else fixes #39	2022-10-12 07:32:42 +03:00
Borislav Stanimirov	0b45d25151	Building with MSVC	2022-10-11 21:40:46 +03:00
Georgi Gerganov	63b6786767	Minor	2022-10-10 22:06:27 +03:00
lnyan	4bbb8a587b	Add MinGW support	2022-10-09 22:26:37 +08:00
Georgi Gerganov	2ca8cc77b2	ref #17 : print whisper logs to stderr Only the transcribed/translated text is printed to stdout. This way, one can redirect the result to a file.	2022-10-08 17:28:06 +03:00
Georgi Gerganov	8c7c018893	ref #17 : add options to output result to file Support for: - plain text - VTT - SRT	2022-10-08 17:22:22 +03:00
Georgi Gerganov	b43b36e006	Update tests	2022-10-08 11:43:42 +03:00
Georgi Gerganov	2f069335ab	Adding sanitizer tests	2022-10-08 11:43:42 +03:00
Georgi Gerganov	332c9d77fe	whisper : fix bug in token sampling logic Could overflow buffer	2022-10-08 09:02:41 +03:00
Georgi Gerganov	481cd685d5	ref #10 : option to keep context in "stream" example Seems the results become worse when we keep the context, so by default this is not enabled	2022-10-07 22:30:44 +03:00
Georgi Gerganov	7787b878e1	ref #16 , #22 : add "offset" argument Allows to start processing the input audio at some offset from the beginning. Useful for splitting a long job into multiple tasks.	2022-10-07 22:00:40 +03:00
Georgi Gerganov	167324584b	wip : rpi4 support	2022-10-05 23:03:46 +03:00
Georgi Gerganov	ce1fe95902	wip : improve makefile	2022-10-05 23:03:46 +03:00
Georgi Gerganov	6814cc9b02	Improve result printing	2022-10-04 23:18:15 +03:00
Georgi Gerganov	eba33adadd	Extend C-style API with full inference methods	2022-10-04 23:18:15 +03:00
Georgi Gerganov	6b77124e01	Initial C-style interface for whisper.cpp	2022-10-04 23:18:15 +03:00

228 Commits