Finding Java 8 Streams too awkward? Give JDFrame a try

2024-04-30 碼農

Source: juejin.cn/post/7356652717392740404


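0. Introduction

I can rarely remember the Stream API calls I need, so I end up copying them around, and the resulting code is long and hard to read. I wanted a more semantic API, which reminded me of the DataFrame-style APIs I used with big-data tools such as Spark and pandas. It turns out that DataFrame models also exist on the JVM, for example tablesaw and joinery.

However, those libraries require column names to be hard-coded as strings, which is hard to accept if you care about clean code. I only need some simple statistics, and I wanted to be able to pick the fields to process with lambdas or method references in those scenarios. That is how this tool came about:

A JVM-level, DataFrame-like tool that makes Java 8 stream processing more semantic and more concise.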

1. Quick start

1.1. Adding the dependency

    <dependency>
        <groupId>io.github.burukeyou</groupId>
        <artifactId>jdframe</artifactId>
        <version>0.0.2</version>
    </dependency>

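1.2. Example

For each school, compute the total score of students whose age is not null and lies between 9 and 16, then take the top 2 schools by total score.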

    static List<Student> studentList = new ArrayList<>();

    static {
        studentList.add(new Student(1, "a", "一中", "一年級", 11, new BigDecimal(1)));
        studentList.add(new Student(2, "a", "一中", "一年級", 11, new BigDecimal(1)));
        studentList.add(new Student(3, "b", "一中", "三年級", 12, new BigDecimal(2)));
        studentList.add(new Student(4, "c", "二中", "一年級", 13, new BigDecimal(3)));
        studentList.add(new Student(5, "d", "二中", "一年級", 14, new BigDecimal(4)));
        studentList.add(new Student(6, "e", "三中", "二年級", 14, new BigDecimal(5)));
        studentList.add(new Student(7, "e", "三中", "二年級", 15, new BigDecimal(5)));
    }

    // Equivalent SQL:
    //   select school, sum(score)
    //   from students
    //   where age is not null and age >= 9 and age <= 16
    //   group by school
    //   order by sum(score) desc
    //   limit 2
    SDFrame<FI2<String, BigDecimal>> sdf2 = SDFrame.read(studentList)
            .whereNotNull(Student::getAge)
            .whereBetween(Student::getAge, 9, 16)
            .groupBySum(Student::getSchool, Student::getScore)
            .sortDesc(FI2::getC2)
            .cutFirst(2);

    sdf2.show();

Output:

    c1    c2
    三中  10
    二中  7

The Student class used in the examples:

    @Data
    @AllArgsConstructor
    @NoArgsConstructor
    public class Student {
        private int id;
        private String name;
        private String school;
        private String level;
        private Integer age;
        private BigDecimal score;
        private Integer rank;

        public Student(String level, BigDecimal score) {
            this.level = level;
            this.score = score;
        }

        public Student(int id, String name, String school, String level, Integer age, BigDecimal score) {
            this.id = id;
            this.name = name;
            this.school = school;
            this.level = level;
            this.age = age;
            this.score = score;
        }
    }
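For comparison, here is a rough sketch of how the same query might look with only the native Stream API. It does not use JDFrame at all. The class name `NativeStreamTopSchools`, the trimmed-down `Student` stand-in, and the ASCII school names (`School1`, `School2`, `School3` standing for 一中, 二中, 三中) are assumptions made for illustration.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class NativeStreamTopSchools {

    // Trimmed-down stand-in for the article's Student (only the fields this query needs).
    static final class Student {
        private final String school;
        private final Integer age;
        private final BigDecimal score;

        Student(String school, Integer age, int score) {
            this.school = school;
            this.age = age;
            this.score = BigDecimal.valueOf(score);
        }

        String getSchool() { return school; }
        Integer getAge() { return age; }
        BigDecimal getScore() { return score; }
    }

    // Same (school, age, score) triples as the sample data, with ASCII school names.
    static List<Student> sampleData() {
        return Arrays.asList(
                new Student("School1", 11, 1), new Student("School1", 11, 1),
                new Student("School1", 12, 2), new Student("School2", 13, 3),
                new Student("School2", 14, 4), new Student("School3", 14, 5),
                new Student("School3", 15, 5));
    }

    static List<Map.Entry<String, BigDecimal>> topSchools(List<Student> students, int n) {
        // where age is not null and age between 9 and 16, group by school, sum(score)
        Map<String, BigDecimal> totals = students.stream()
                .filter(s -> s.getAge() != null && s.getAge() >= 9 && s.getAge() <= 16)
                .collect(Collectors.groupingBy(
                        Student::getSchool,
                        Collectors.reducing(BigDecimal.ZERO, Student::getScore, BigDecimal::add)));
        // order by sum(score) desc, limit n
        return totals.entrySet().stream()
                .sorted(Map.Entry.<String, BigDecimal>comparingByValue().reversed())
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (Map.Entry<String, BigDecimal> e : topSchools(sampleData(), 2)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

The two-pass shape (collect into a map, then stream the entries again to sort and limit) is the kind of ceremony the fluent `groupBySum(...).sortDesc(...).cutFirst(...)` chain is meant to hide.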

2. API examples

2.1. Matrix information

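    void show(int n);                       // print the matrix to the console
    List<String> columns();                 // get the header (column names) of the matrix
    List<R> col(Function<T, R> function);   // get the values of one column
    T head();                               // get the first element
    List<T> head(int n);                    // get the first n elements
    T tail();                               // get the last element
    List<T> tail(int n);                    // get the last n elements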

2.2. Filtering

    SDFrame.read(studentList)
            .whereBetween(Student::getAge, 3, 6)                 // age in [3, 6]
            .whereBetweenR(Student::getAge, 3, 6)                // age in (3, 6], excluding 3
            .whereBetweenL(Student::getAge, 3, 6)                // age in [3, 6), excluding 6
            .whereNotNull(Student::getName)                      // name is not null; the empty string '' is also treated as empty
            .whereGt(Student::getAge, 3)                         // age greater than 3
            .whereGe(Student::getAge, 3)                         // age greater than or equal to 3
            .whereLt(Student::getAge, 3)                         // age less than 3
            .whereIn(Student::getAge, Arrays.asList(3, 7, 8))    // age is 3, 7 or 8
            .whereNotIn(Student::getAge, Arrays.asList(3, 7, 8)) // age is not 3, 7 or 8
            .whereEq(Student::getAge, 3)                         // age equals 3
            .whereNotEq(Student::getAge, 3)                      // age does not equal 3
            .whereLike(Student::getName, "jay")                  // fuzzy match, equivalent to like "%jay%"
            .whereLikeLeft(Student::getName, "jay")              // fuzzy match, equivalent to like "jay%"
            .whereLikeRight(Student::getName, "jay");            // fuzzy match, equivalent to like "%jay"

2.3. Aggregation

    JDFrame<Student> frame = JDFrame.read(studentList);

    Student s1 = frame.max(Student::getAge);                  // the student with the greatest age
    Integer s2 = frame.maxValue(Student::getAge);             // the greatest age among the students
    Student s3 = frame.min(Student::getAge);                  // the student with the smallest age
    Integer s4 = frame.minValue(Student::getAge);             // the smallest age among the students
    BigDecimal s5 = frame.avg(Student::getAge);               // the average age of all students
    BigDecimal s6 = frame.sum(Student::getAge);               // the sum of all students' ages
    MaxMin<Student> s7 = frame.maxMin(Student::getAge);       // the oldest and youngest students in one call
    MaxMin<Integer> s8 = frame.maxMinValue(Student::getAge);  // the greatest and smallest ages in one call

2.4. Deduplication

The native stream only supports deduplicating whole objects; it has no built-in way to deduplicate by a specific field.

    List<Student> std = null;

    // deduplicate by the object's hashCode
    std = SDFrame.read(studentList).distinct().toLists();

    // deduplicate by school name
    std = SDFrame.read(studentList).distinct(Student::getSchool).toLists();

    // deduplicate by school name concatenated with level
    std = SDFrame.read(studentList).distinct(e -> e.getSchool() + e.getLevel()).toLists();

    // deduplicate by school name first, then by level
    std = SDFrame.read(studentList).distinct(Student::getSchool).distinct(Student::getLevel).toLists();
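To make the contrast concrete, this is one way deduplication by a key could be written by hand in plain Java. It is only a sketch: the helper name `distinctBy` is hypothetical, the rule of keeping the first element per key is an assumption, and nothing here claims to match how JDFrame implements `distinct(...)` internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

final class DistinctByKey {

    /** Keeps the first element seen for each key, preserving the input order. */
    static <T, K> List<T> distinctBy(List<T> items, Function<? super T, ? extends K> keyOf) {
        Set<K> seen = new HashSet<>();
        List<T> result = new ArrayList<>();
        for (T item : items) {
            // Set.add returns false when the key has been seen before.
            if (seen.add(keyOf.apply(item))) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Each row is {school, level}; ASCII placeholders for the article's data.
        List<String[]> rows = Arrays.asList(
                new String[] {"School1", "G1"},
                new String[] {"School1", "G3"},
                new String[] {"School2", "G1"},
                new String[] {"School3", "G2"});

        // Deduplicate by school: one row per school survives.
        for (String[] row : distinctBy(rows, r -> r[0])) {
            System.out.println(row[0] + " " + row[1]);
        }
    }
}
```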

2.5. Simple grouping and aggregation

These methods follow SQL's group by semantics and simplify grouping and aggregation logic that could take a long chain of calls with the native stream.

    JDFrame<Student> frame = JDFrame.from(studentList);

    // Equivalent to: select school, sum(age) ... group by school
    List<FI2<String, BigDecimal>> a = frame.groupBySum(Student::getSchool, Student::getAge).toLists();

    // Equivalent to: select school, max(age) ... group by school
    List<FI2<String, Integer>> a2 = frame.groupByMaxValue(Student::getSchool, Student::getAge).toLists();

    // Same meaning as groupByMaxValue, but returns the object that holds the maximum value
    List<FI2<String, Student>> a3 = frame.groupByMax(Student::getSchool, Student::getAge).toLists();

    // Equivalent to: select school, min(age) ... group by school
    List<FI2<String, Integer>> a4 = frame.groupByMinValue(Student::getSchool, Student::getAge).toLists();

    // Equivalent to: select school, count(*) ... group by school
    List<FI2<String, Long>> a5 = frame.groupByCount(Student::getSchool).toLists();

    // Equivalent to: select school, avg(age) ... group by school
    List<FI2<String, BigDecimal>> a6 = frame.groupByAvg(Student::getSchool, Student::getAge).toLists();

    // Equivalent to: select school, sum(age), count(age) ... group by school
    List<FI3<String, BigDecimal, Long>> a7 = frame.groupBySumCount(Student::getSchool, Student::getAge).toLists();

    // (Two-level grouping) equivalent to: select school, level, sum(age) ... group by school, level
    List<FI3<String, String, BigDecimal>> a8 = frame.groupBySum(Student::getSchool, Student::getLevel, Student::getAge).toLists();

    // (Three-level grouping) equivalent to: select school, level, name, sum(age) ... group by school, level, name
    List<FI4<String, String, String, BigDecimal>> a9 = frame.groupBySum(Student::getSchool, Student::getLevel, Student::getName, Student::getAge).toLists();
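As a point of reference, a native-stream version of "sum, count, min, max and average of age per school" might look roughly like the sketch below. It relies on `Collectors.summarizingInt` from the JDK; the class name `NativeGroupBy`, the reduced `Student` stand-in, and the ASCII school names are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class NativeGroupBy {

    // Reduced stand-in for the article's Student: only school and age.
    static final class Student {
        private final String school;
        private final Integer age;

        Student(String school, Integer age) {
            this.school = school;
            this.age = age;
        }

        String getSchool() { return school; }
        Integer getAge() { return age; }
    }

    static List<Student> sampleData() {
        return Arrays.asList(
                new Student("School1", 11), new Student("School1", 11),
                new Student("School1", 12), new Student("School2", 13),
                new Student("School2", 14), new Student("School3", 14),
                new Student("School3", 15));
    }

    /** select school, sum(age), count(age), min(age), max(age), avg(age) ... group by school */
    static Map<String, IntSummaryStatistics> ageStatsBySchool(List<Student> students) {
        return students.stream()
                .filter(s -> s.getAge() != null)   // skip null ages, as SQL aggregates do
                .collect(Collectors.groupingBy(
                        Student::getSchool,
                        TreeMap::new,              // sorted keys, for stable printing
                        Collectors.summarizingInt(Student::getAge)));
    }

    public static void main(String[] args) {
        ageStatsBySchool(sampleData()).forEach((school, stats) ->
                System.out.println(school + " sum=" + stats.getSum()
                        + " count=" + stats.getCount()
                        + " max=" + stats.getMax()));
    }
}
```

Multi-level grouping with the native API needs either nested `groupingBy` collectors or a composite key, which is where the single-call `groupBySum(school, level, age)` form saves the most code.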

2.6. Sorting

Sorting is simpler than with the native stream: you specify the field directly, without building a Comparator and keeping track of whether it sorts ascending or descending.

    // Equivalent to: order by age desc
    SDFrame.read(studentList).sortDesc(Student::getAge);

    // Equivalent to: order by age desc, level asc
    SDFrame.read(studentList).sortDesc(Student::getAge).sortAsc(Student::getLevel);

    // Equivalent to: order by age asc
    SDFrame.read(studentList).sortAsc(Student::getAge);

    // Sort with a Comparator
    SDFrame.read(studentList).sortAsc(Comparator.comparing(e -> e.getLevel() + e.getId()));
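For readers who want to see what is being abstracted away, the following sketch writes "order by age desc, level asc" with a plain `Comparator` chain. The class name `NativeSort` and the reduced `Student` stand-in (with ASCII levels `G1` to `G3`) are assumptions; ages are assumed to be non-null here.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class NativeSort {

    // Reduced stand-in for the article's Student: id, level and age only.
    static final class Student {
        private final int id;
        private final String level;
        private final Integer age;

        Student(int id, String level, Integer age) {
            this.id = id;
            this.level = level;
            this.age = age;
        }

        int getId() { return id; }
        String getLevel() { return level; }
        Integer getAge() { return age; }
    }

    static List<Student> sampleData() {
        return Arrays.asList(
                new Student(1, "G1", 11), new Student(2, "G1", 11),
                new Student(3, "G3", 12), new Student(4, "G1", 13),
                new Student(5, "G1", 14), new Student(6, "G2", 14),
                new Student(7, "G2", 15));
    }

    /** order by age desc, level asc; returns the ids in the resulting order. */
    static List<Integer> idsByAgeDescThenLevelAsc(List<Student> students) {
        return students.stream()
                .sorted(Comparator.comparing(Student::getAge, Comparator.reverseOrder())
                        .thenComparing(Student::getLevel))
                .map(Student::getId)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(idsByAgeDescThenLevelAsc(sampleData()));
    }
}
```

Passing `Comparator.reverseOrder()` for the age key reverses only that key. Calling `.reversed()` at the end of the chain would flip the whole ordering, including the level tie-breaker, which is an easy mistake to make with the native API.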

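2.7. Joining matrices

API list

    append(T t);                                                    // equivalent to add on a collection
    union(IFrame<T> other);                                         // equivalent to addAll on a collection
    join(IFrame<K> other, JoinOn<T,K> on, Join<T,K,R> join);        // equivalent to a SQL inner join
    leftJoin(IFrame<K> other, JoinOn<T,K> on, Join<T,K,R> join);    // equivalent to a SQL left join; when there is no match, K is null and must be checked by hand
    rightJoin(IFrame<K> other, JoinOn<T,K> on, Join<T,K,R> join);   // equivalent to a SQL right join; when there is no match, T is null and must be checked by hand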

Inner join example:

    System.out.println("======== 矩陣1 =======");   // "Matrix 1"
    SDFrame<Student> sdf = SDFrame.read(studentList);
    sdf.show(20);

    // Top 10 schools by total score, counting only students aged 9 to 16
    SDFrame<FI2<String, BigDecimal>> sdf2 = SDFrame.read(studentList)
            .whereNotNull(Student::getAge)
            .whereBetween(Student::getAge, 9, 16)
            .groupBySum(Student::getSchool, Student::getScore)
            .sortDesc(FI2::getC2)
            .cutFirst(10);

    System.out.println("======== 矩陣2 =======");   // "Matrix 2"
    sdf2.show();

    SDFrame<UserInfo> frame = sdf.join(sdf2, (a, b) -> a.getSchool().equals(b.getC1()), (a, b) -> {
        UserInfo userInfo = new UserInfo();
        userInfo.setKey1(a.getSchool());
        userInfo.setKey2(b.getC2().intValue());
        userInfo.setKey3(String.valueOf(a.getId()));
        return userInfo;
    });

    System.out.println("======== 連線後結果 =======");   // "Result after join"
    frame.show(5);

Printed output:

    ======== 矩陣1 =======
    id  name  school  level   age  score  rank
    1   a     一中    一年級  11   1
    2   a     一中    一年級  11   1
    3   b     一中    一年級  12   2
    4   c     二中    一年級  13   3
    5   d     二中    一年級  14   4
    6   e     三中    二年級  14   5
    7   e     三中    二年級  15   5
    ======== 矩陣2 =======
    c1    c2
    三中  10
    二中  7
    一中  4
    ======== 連線後結果 =======
    key1  key2  key3  key4
    一中  4     1
    一中  4     2
    一中  4     3
    二中  7     4
    二中  7     5

This is similar to:

select a.*,b.* from sdf a inner join sdf2 b on a.school = b.c1
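The join semantics described above (a match predicate plus a function that builds the joined row, with null on the unmatched side of a left join) can be sketched in plain Java as a nested-loop join. This is an illustrative sketch only: `NestedLoopJoin`, `innerJoin` and `leftJoin` are hypothetical names, they use the JDK's `BiPredicate` and `BiFunction` in place of JDFrame's `JoinOn` and `Join`, and they say nothing about how JDFrame actually executes its joins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BiPredicate;

final class NestedLoopJoin {

    /** Emits one combined row for every (left, right) pair that satisfies the predicate. */
    static <T, K, R> List<R> innerJoin(List<T> left, List<K> right,
                                       BiPredicate<? super T, ? super K> on,
                                       BiFunction<? super T, ? super K, ? extends R> combine) {
        List<R> out = new ArrayList<>();
        for (T l : left) {
            for (K r : right) {
                if (on.test(l, r)) {
                    out.add(combine.apply(l, r));
                }
            }
        }
        return out;
    }

    /** Like innerJoin, but an unmatched left row is still emitted, with null as the right side. */
    static <T, K, R> List<R> leftJoin(List<T> left, List<K> right,
                                      BiPredicate<? super T, ? super K> on,
                                      BiFunction<? super T, ? super K, ? extends R> combine) {
        List<R> out = new ArrayList<>();
        for (T l : left) {
            boolean matched = false;
            for (K r : right) {
                if (on.test(l, r)) {
                    out.add(combine.apply(l, r));
                    matched = true;
                }
            }
            if (!matched) {
                out.add(combine.apply(l, null)); // the caller must handle the null right side
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Left rows: {id, school}. Right rows: {school, totalScore}. ASCII placeholder data.
        List<String[]> students = Arrays.asList(
                new String[] {"1", "School1"}, new String[] {"4", "School2"},
                new String[] {"8", "School4"});
        List<String[]> totals = Arrays.asList(
                new String[] {"School1", "4"}, new String[] {"School2", "7"});

        List<String> joined = leftJoin(students, totals,
                (s, t) -> s[1].equals(t[0]),
                (s, t) -> s[1] + " " + (t == null ? "-" : t[1]) + " " + s[0]);
        joined.forEach(System.out::println);
    }
}
```

A nested loop costs O(n·m) comparisons, which is acceptable for small in-memory lists. For larger inputs with an equality condition, indexing one side in a `Map` first would be the usual alternative.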

2.8. Miscellaneous

Percentage conversion

    // Equivalent to: select round(score*100, 2) from student
    SDFrame<Student> map2 = SDFrame.read(studentList).mapPercent(Student::getScore, Student::setScore, 2);

Partitioning

Split the data into sublists of 5 elements each, which is useful for breaking one large task into smaller ones.

List<List<Student>> t = SDFrame.read(studentList).partition(5).toLists();
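The idea behind partitioning can be sketched in a few lines of plain Java. The helper below is a hypothetical stand-alone version (the name `partition` in class `ListPartition` is an assumption, and it is not JDFrame's implementation); the last chunk simply holds whatever elements remain.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ListPartition {

    /** Splits items into consecutive chunks of at most `size` elements. */
    static <T> List<List<T>> partition(List<T> items, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += size) {
            int end = Math.min(start + size, items.size());
            // Copy the view so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Seven elements in chunks of five: one full chunk and one with the remainder.
        System.out.println(partition(Arrays.asList(1, 2, 3, 4, 5, 6, 7), 5));
    }
}
```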

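Generating sequence numbers

Sort by age in descending order, then write a sequence number based on the current order into the rank field (numbering starts at 0).

    SDFrame.read(studentList)
            .sortDesc(Student::getAge)
            .addSortNoCol(Student::setRank)
            .show(30);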

Output:

    id  name  school  level   age  score  rank
    7   e     三中    二年級  15   5      0
    5   d     二中    一年級  14   4      1
    6   e     三中    二年級  14   5      2
    4   c     二中    一年級  13   3      3
    3   b     一中    三年級  12   2      4
    1   a     一中    一年級  11   1      5
    2   a     一中    一年級  11   1      6

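Generating rankings

Sort by age in descending order, then write a ranking number based on that order into the rank field (in the sample output below, ranks start at 1).

Unlike a sequence number, a ranking gives equal values the same rank.

    SDFrame<Student> df = SDFrame.read(studentList).addRankingSameColDesc(Student::getAge, Student::setRank);
    df.show(20);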

Output:

    id  name  school  level   age  score  rank
    7   e     三中    二年級  15   5      1
    5   d     二中    一年級  14   4      2
    6   e     三中    二年級  14   5      2
    4   c     二中    一年級  13   3      3
    3   b     一中    一年級  12   2      4
    1   a     一中    一年級  11   1      5
    2   a     一中    一年級  11   1      5
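The ranks in this output (1, 2, 2, 3, 4, 5, 5) follow what SQL calls a dense rank: ties share a rank and the next distinct value gets the next number without gaps. A minimal plain-Java sketch of that rule is shown below, assuming non-null values; `DenseRank` and `rankDesc` are hypothetical names and this is not JDFrame's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class DenseRank {

    /**
     * Sorts the values in descending order and pairs each with its dense rank.
     * Each element of the result is {value, rank}; ranks start at 1.
     */
    static List<int[]> rankDesc(List<Integer> values) {
        List<Integer> sorted = new ArrayList<>(values);
        sorted.sort(Comparator.reverseOrder());

        List<int[]> ranked = new ArrayList<>();
        int rank = 0;
        Integer previous = null;
        for (Integer value : sorted) {
            // Only a change in value advances the rank, so ties share one rank.
            if (previous == null || !value.equals(previous)) {
                rank++;
            }
            ranked.add(new int[] {value, rank});
            previous = value;
        }
        return ranked;
    }

    public static void main(String[] args) {
        // The ages from the sample data.
        for (int[] row : rankDesc(Arrays.asList(11, 11, 12, 13, 14, 14, 15))) {
            System.out.println("age=" + row[0] + " rank=" + row[1]);
        }
    }
}
```

A sequence number, by contrast, is just the position in the sorted list, so it always increases by one even across ties.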

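Replenishing missing entries

1. Replenishing missing school entries

    // All the school entries that are required
    List<String> allDim = Arrays.asList("一中", "二中", "三中", "四中");

    // Compare the school field against allDim and fill in the missing entries.
    // For each missing school, the ReplenishFunction generates an entry that is returned together with the original data.
    SDFrame.read(studentList).replenish(Student::getSchool, allDim, (school) -> new Student(school)).show();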

Output:

    id  name  school  level   age  score  rank
    1   a     一中    一年級  11   1
    2   a     一中    一年級  11   1
    3   b     一中    一年級  12   2
    4   c     二中    一年級  13   3
    5   d     二中    一年級  14   4
    6   e     三中    二年級  14   5
    7   e     三中    二年級  15   5
    0         四中
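The replenish step can be understood as "find which required dimension values are absent, then generate a placeholder for each". The sketch below shows that idea in plain Java under stated assumptions: `Replenish.replenish` is a hypothetical helper, it uses the JDK's `Function` in place of JDFrame's ReplenishFunction, and it appends the generated entries at the end, as the output above suggests.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

final class Replenish {

    /**
     * Returns the original items followed by one generated item for every
     * value in allDims that no original item maps to.
     */
    static <T, D> List<T> replenish(List<T> items,
                                    Function<? super T, ? extends D> dimOf,
                                    List<D> allDims,
                                    Function<? super D, ? extends T> factory) {
        Set<D> present = new HashSet<>();
        for (T item : items) {
            present.add(dimOf.apply(item));
        }
        List<T> result = new ArrayList<>(items);
        for (D dim : allDims) {
            // add() returns true only for a value not seen yet, which also
            // guards against duplicates inside allDims.
            if (present.add(dim)) {
                result.add(factory.apply(dim));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Rows are "school:name" strings; ASCII placeholders for the article's data.
        List<String> rows = Arrays.asList("School1:a", "School2:c", "School3:e");
        List<String> allSchools = Arrays.asList("School1", "School2", "School3", "School4");

        List<String> filled = replenish(rows,
                row -> row.substring(0, row.indexOf(':')),
                allSchools,
                school -> school + ":");   // placeholder row with an empty name
        filled.forEach(System.out::println);
    }
}
```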

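2. Replenishing missing entries within each group

Group by school and collect every level that appears in the data into allDim. Then compare each group against allDim and fill in the levels missing from that group; each missing level gets an entry generated by the ReplenishFunction.

    SDFrame.read(studentList)
            .replenish(Student::getSchool, Student::getLevel, (school, level) -> new Student(school, level))
            .show(30);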

Output:

    id  name  school  level   age  score  rank
    1   a     一中    一年級  11   1
    2   a     一中    一年級  11   1
    3   b     一中    三年級  12   2
    0         一中    二年級
    4   c     二中    一年級  13   3
    5   d     二中    一年級  14   4
    0         二中    三年級
    0         二中    二年級
    6   e     三中    二年級  14   5
    7   e     三中    二年級  15   5
    0         三中    一年級
    0         三中    三年級

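Example use case: you need to compute data for every month of the last two years, but some year-months may be absent from the data. In that case, fill in the missing year-months and return them together with the rest of the result.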