No Love for Negative Permissions – DAC/ACL Bypass on Linux
14 by Deeg9rie9usi | 3 comments on Hacker News.
Wednesday, August 23, 2023
New top story on Hacker News: In deadly Maui fires, those who dodged barricades survived
44 by mutant_glofish | 22 comments on Hacker News.
New top story on Hacker News: Show HN: Gentrace – evaluation and observability for generative AI
17 by dsaffy | 0 comments on Hacker News.
Hi HN,

Gentrace is our new evaluation and observability tool for generative AI (open beta).

Generative pipelines are hard to evaluate because outputs are subjective. Lots of developers end up just doing “gut checks” on a few inputs before shipping changes, or they build up a spreadsheet of test cases that they manually run through the pipeline. Some companies outsource filling out the spreadsheet. In any of these cases, you end up with a very slow and expensive evaluation process.

At one point, we did this too. Gentrace is the result of a pivot: it was an internal tool that we used to automatically grade new PRs as developers shipped changes to generative pipelines, and other people thought it might be useful.

Gentrace makes pre-production testing of generative pipelines continuous and nearly instantaneous. In Gentrace, you:

- Import and/or construct suites of test data
- Use a combination of AI and heuristic evaluators to grade for quality, hallucination, safety, etc.
- Use our interface to correct automated grades or add your own (yourself or a member of your team)

Gentrace integrates at a code level for evaluation, meaning we test your generative AI pipeline the way you would test normal code. This allows you to test more than just prompt changes; for example, you can compare models (e.g. Claude 2 vs GPT-4 vs GPT 3.5 vs Llama 2) or see the effects of additional chained steps (“Rewrite the previous answer in the following tone:”).

Here’s a video overview that goes into a bit more detail: https://youtu.be/XxgDPSrTWIw

In production, Gentrace observes for speed, cost, and data flow. It also shows real user feedback. We do this by integrating via our SDK at a code level; Gentrace does not proxy requests. Soon, we’ll allow you to convert production data into test cases, allowing customer support to turn bad production generations into “failing tests” for AI teams to make pass.

We process interim steps and multiple outputs as well, helping evaluate agent flows / chains where the “last output” isn’t always the only thing that matters.

There have been a lot of observability tools published recently. We differ from those by focusing more strongly on blending observability with strong evaluation and by using an SDK rather than a “man-in-the-middle” approach to capturing data (i.e. Gentrace can be down and your request to OpenAI will still succeed).

Within the evaluation landscape, we differentiate by integrating with code (see above for benefits) for capturing generative outputs and by providing a customizable UI workflow for building evaluators. In Gentrace, you start with off-the-shelf automated evaluators and then customize them to your specific task. You can also build and run new evaluators on old generative outputs. Finally, you can easily override automated evaluators and/or blend automated evaluation with evaluation by humans on your team.

We also focus on being suitable for business use. We are SOC 2 Type 1 compliant (Type 2 coming shortly), have robust legal documentation around data processing, security, and privacy, and have already passed several vendor legal and security reviews at large technology companies.

Our standard usage-based pricing is available on the website: https://ift.tt/InDw8Kq

If you are building features with generative AI, we would love to get your feedback. You can self-serve sign up (without a credit card) for a 14-day trial here: https://gentrace.ai/

We’re available right here for feedback and questions. We’re also available at support@gentrace.ai.

Best,
Doug, Vivek, and Daniel
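The idea in the post of testing a generative pipeline "the way you would test normal code" could look roughly like the sketch below: a small suite of test cases, a couple of heuristic evaluators, and an average score per evaluator. This is not Gentrace's SDK. Every name here (`TestCase`, `run_suite`, `stub_pipeline`, the evaluator functions) is hypothetical and exists only for illustration, and the pipeline is a deterministic stub rather than a real model call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class TestCase:
    """One row of test data (hypothetical shape, for illustration only)."""
    prompt: str
    must_include: str  # substring a good answer is expected to contain


# An evaluator maps (test case, pipeline output) to a score in [0, 1].
Evaluator = Callable[[TestCase, str], float]


def contains_expected(case: TestCase, output: str) -> float:
    """Heuristic check: does the output mention the expected substring?"""
    return 1.0 if case.must_include.lower() in output.lower() else 0.0


def within_length(case: TestCase, output: str) -> float:
    """Heuristic check: is the output at most 200 characters long?"""
    return 1.0 if len(output) <= 200 else 0.0


def run_suite(
    pipeline: Callable[[str], str],
    cases: List[TestCase],
    evaluators: Dict[str, Evaluator],
) -> Dict[str, float]:
    """Run every case through the pipeline and average each evaluator's score."""
    if not cases:
        raise ValueError("the test suite is empty")
    totals = {name: 0.0 for name in evaluators}
    for case in cases:
        output = pipeline(case.prompt)
        for name, evaluate in evaluators.items():
            totals[name] += evaluate(case, output)
    return {name: total / len(cases) for name, total in totals.items()}


def stub_pipeline(prompt: str) -> str:
    """Stand-in for a real generative pipeline; it only echoes the prompt."""
    return "Echo: " + prompt


CASES = [
    TestCase(prompt="Name the capital of France (Paris).", must_include="paris"),
    TestCase(prompt="Name the capital of Japan.", must_include="tokyo"),
]
EVALUATORS = {
    "contains_expected": contains_expected,
    "within_length": within_length,
}

scores = run_suite(stub_pipeline, CASES, EVALUATORS)
print(scores)
```

Because `run_suite` takes the pipeline as a plain function, comparing two models or an extra chained step amounts to calling it twice with different pipeline functions and comparing the resulting score dictionaries.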